HEPATITIS C ASSAY UTILIZING RECOMBINANT ANTIGENS TO NS1

This is a continuation-in-part application of U.S. Serial No. 07/572,822,
filed August 24, 1990, and U.S. Serial No. 07/614,069, filed November 7, 1990,
which enjoy common ownership and are incorporated herein by reference. This
application also is related to co-filed patent applications entitled "HEPATITIS C
ASSAY UTILIZING RECOMBINANT ANTIGENS FROM NS5 REGION" (U.S. Serial No.
748,565) and "HEPATITIS C ASSAY UTILIZING RECOMBINANT ANTIGENS TO C-100
REGION" (U.S. Serial No. 748,566), which enjoy common ownership and are
incorporated herein by reference.

This invention relates generally to an assay for identifying the presence in a
sample of an antibody which is immunologically reactive with a hepatitis C virus
antigen and specifically to an assay for detecting a complex of an antibody and
recombinant antigens representing distinct regions of the HCV genome. Recombinant
antigens derived from the molecular cloning and expression in a heterologous
expression system of the synthetic DNA sequences representing distinct antigenic
regions of the HCV genome can be used as reagents for the detection of antibodies and
antigen in body fluids from individuals exposed to hepatitis C virus (HCV).

BACKGROUND OF THE INVENTION

Acute viral hepatitis is clinically diagnosed by a well-defined set of patient
symptoms, including jaundice, hepatic tenderness, and an increase in the serum
levels of alanine aminotransferase (ALT) and aspartate aminotransferase.
Additional serologic immunoassays are generally performed to diagnose the specific
type of viral causative agent. Historically, patients presenting clinical hepatitis
symptoms and not otherwise infected by hepatitis A, hepatitis B, Epstein-Barr or
cytomegalovirus were clinically diagnosed as having non-A non-B hepatitis
(NANBH) by default. The disease may result in chronic liver damage.

Each of the well-known, immunologically characterized hepatitis-inducing
viruses, hepatitis A virus (HAV), hepatitis B virus (HBV), and hepatitis D virus
(HDV), belongs to a separate family of viruses and has a distinctive viral
organization, protein structure, and mode of replication.

Attempts to identify the NANBH virus by virtue of genomic similarity to one
of the known hepatitis viruses have failed, suggesting that NANBH has a distinct
organization and structure. [Fowler, et al., J. Med. Virol., 12:205-213 (1983)
and Weiner, et al., J. Med. Virol., 21:239-247 (1987)].

Progress in developing assays to detect antibodies specific for NANBH has
been particularly hampered by difficulties in correctly identifying antigens
associated with NANBH. See, for example, Wands, J., et al., U.S. Patent 4,870,076,
Wands, et al., Proc. Nat'l. Acad. Sci., 83:6608-6612 (1986), Ohori, et al., J. Med.
Virol., 12:161-178 (1983), Bradley, et al., Proc. Nat'l. Acad. Sci., 84:6277-
6281 (1987), Akatsuka, T., et al., J. Med. Virol., 20:43-56 (1986), Seto, B., et
al., U.S. Patent Application Number 07/234,641 (available from U.S. Department
of Commerce National Technical Information Service, Springfield, Virginia, No.
89138168), Takahashi, K., et al., European Patent Application No. 0 293 274,
published November 30, 1988, and Seelig, R., et al., in PCT Application
PCT/EP88/00123.

Recently, another hepatitis-inducing virus has been unequivocally identified
as hepatitis C virus (HCV) by Houghton, M., et al., European Patent Application
publication number 0 318 216, May 31, 1989. Related papers describing this
virus include Kuo, G., et al., Science, 244:359-361 (1989) and Choo, Q., et al.,
Science, 244:362-364 (1989). Houghton, M., et al. reported isolating cDNA
sequences from HCV which encode antigens which react immunologically with
antibodies present in patients infected with NANBH, thus establishing that HCV is
one of the viral agents causing NANBH. The cDNA sequences associated with HCV
were isolated from a cDNA library prepared from the RNA obtained from pooled
serum from a chimpanzee with chronic HCV infection. The cDNA library contained
cDNA sequences of approximate mean size of about 200 base pairs. The cDNA
library was screened for encoded epitopes expressed in clones that could bind to
antibodies in sera from patients who had previously experienced NANBH.

In the European Patent Application, Houghton, M., et al. also described the
preparation of several superoxide dismutase (SOD) fusion polypeptides and the use
of these SOD fusion polypeptides to develop an HCV screening assay. The most
complex SOD fusion polypeptide described in the European Patent Application,
designated c100-3, was described as containing 154 amino acids of human SOD at
the amino terminus, 5 amino acid residues derived from the expression of a
synthetic DNA adapter containing a restriction site, EcoRI, 363 amino acids derived
from the expression of a cloned HCV cDNA fragment, and 5 carboxyl terminal amino
acids derived from an MS2 cloning vector nucleotide sequence. The DNA sequence
encoding this polypeptide was transformed into yeast cells using a plasmid. The
transformed cells were cultured and expressed a 54,000 molecular weight
polypeptide which was purified to about 80% purity by differential extraction.
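The segment lengths recited above for c100-3 can be totalled as a quick consistency check. The following is a minimal sketch only: the 110 Da average residue mass is a common rule of thumb assumed here for illustration and is not taken from the cited application, and the helper names are hypothetical.

```python
# Segment lengths (amino acids) as recited in the text for c100-3.
C100_3_SEGMENTS = {
    "human SOD (amino terminus)": 154,
    "synthetic adapter (EcoRI site)": 5,
    "cloned HCV cDNA fragment": 363,
    "MS2 vector-derived carboxyl terminus": 5,
}

# Assumption: rule-of-thumb average mass of one amino acid residue, in daltons.
AVG_RESIDUE_MASS_DA = 110


def total_length(segments: dict[str, int]) -> int:
    """Sum the residue counts of all segments of a fusion polypeptide."""
    return sum(segments.values())


def approx_mass_da(n_residues: int) -> int:
    """Rough molecular mass estimate from residue count (rule of thumb only)."""
    return n_residues * AVG_RESIDUE_MASS_DA


if __name__ == "__main__":
    n = total_length(C100_3_SEGMENTS)
    print(f"c100-3 length: {n} residues")
    print(f"rough mass estimate: {approx_mass_da(n)} Da")
```

Under this assumption the estimate is of the same order as the 54,000 molecular weight reported for the expressed polypeptide; the rule of thumb is too coarse to support any closer comparison.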

Other SOD fusion polypeptides designated SOD-NANB5-1-1 and SOD-
NANB81 were expressed in recombinant bacteria. The E.coli fusion polypeptides
were purified by differential extraction and by chromatography using anion and
cation exchange columns. The purification procedures were able to produce SOD-
NANB5-1-1 as about 80% pure and SOD-NANB81 as about 50% pure.

The recombinant SOD fusion polypeptides described by Houghton, M., et al.
were coated on microtiter wells or polystyrene beads and used to assay serum
samples. Briefly, coated microtiter wells were incubated with a sample in a
diluent. After incubation, the microtiter wells were washed and then developed
using either a radioactively labelled sheep anti-human antibody or a mouse
anti-human IgG-HRP (horseradish peroxidase) conjugate. These assays were used to
detect both post acute phase and chronic phase HCV infection.

Due to the preparative methods, assay specificity required adding yeast or 
E.coli extracts to the samples in order to prevent undesired immunological 
reactions with any yeast or E.coli antibodies present in samples. 

Ortho Diagnostic Systems Inc. has developed an immunoenzyme assay to
detect antibodies to HCV antigens. The Ortho assay procedure is a three-stage test
for serum/plasma carried out in a microwell coated with the recombinant
yeast/hepatitis C virus SOD fusion polypeptide c100-3.

In the first stage, a test specimen is diluted directly in the test well and
incubated for a specified length of time. If antibodies to HCV antigens are present in
the specimen, antigen-antibody complexes will be formed on the microwell surface.
If no antibodies are present, complexes will not be formed and the unbound serum
or plasma proteins will be removed in a washing step.

In the second stage, anti-human IgG murine monoclonal antibody horseradish
peroxidase conjugate is added to the microwell. The conjugate binds specifically to
the antibody portion of the antigen-antibody complexes. If antigen-antibody
complexes are not present, the unbound conjugate will also be removed by a washing
step.

In the third stage, an enzyme detection system composed of o-phenylenediamine
2HCl (OPD) and hydrogen peroxide is added to the test well. If
bound conjugate is present, the OPD will be oxidized, resulting in a colored end
product. After formation of the colored end product, dilute sulfuric acid is added to
the microwell to stop the color-forming detection reaction.

The intensity of the colored end product is measured with a microwell
reader. The assay may be used to screen patient serum and plasma.

It is established that HCV may be transmitted by contaminated blood and
blood products. In transfused patients, as many as 10% will suffer from post-
transfusion hepatitis. Of these, approximately 90% are the result of infections
diagnosed as HCV. The prevention of transmission of HCV by blood and blood
products requires reliable, sensitive and specific diagnosis and prognostic tools to
identify HCV carriers as well as contaminated blood and blood products. Thus, there
exists a need for an HCV assay which uses reliable and efficient reagents and
methods to accurately detect the presence of HCV antibodies in samples.



SUMMARY OF THE INVENTION

The present invention provides an improved assay for detecting the presence
of an antibody to an HCV antigen in a sample by contacting the sample with at least
one recombinant protein representing a distinct antigenic region of the HCV genome.

Recombinant antigens which are derived from the molecular cloning and
expression of synthetic DNA sequences in heterologous hosts are provided. Briefly,
synthetic DNA sequences which encode the desired proteins representing distinct
antigenic regions of the HCV genome are optimized for expression in E.coli by
specific codon selection. Specifically, recombinant proteins representing five
distinct antigenic regions of NS1 of the HCV genome are described. The proteins are
expressed as chimeric fusions with the E.coli CMP-KDO synthetase (CKS) gene. The
first protein, expressed by plasmid pHCV-77 (identified as SEQ. ID. NO. 1),
represents amino acids 365-579 of the HCV sequence of NS1 and, based on analogy
to the genomic organization of other flaviviruses, has been named HCV CKS-NS1S1.
Note that the term pHCV-77 will also refer to the fusion protein itself and that
pHCV-77' will be the designation for a polypeptide representing the NS1 region
from about amino acids 365-579 of the HCV sequence prepared using other
recombinant or synthetic methodologies. Other recombinant methodologies would
include the preparation of pHCV-77' utilizing different expression systems. The
methodology for the preparation of synthetic peptides of HCV is described in U.S.
Serial No. 456,162, filed December 22, 1989, and U.S. Serial No. 610,180, filed
November 7, 1990, which enjoy common ownership and are incorporated herein by
reference. The next protein is expressed by plasmid pHCV-65, identified as SEQ.
ID. NO. 2, and represents amino acids 565-731 of the NS1 region of the HCV
genome. pHCV-65 has been named HCV CKS-NS1S2 and is expressed by the plasmid
pHCV-65. The fusion protein itself will also be referred to as pHCV-65 and pHCV-65'
shall be the designation for a polypeptide from the NS1 region representing
from about amino acids 565-731 of the HCV sequence prepared using other
recombinant or synthetic methodologies. The next recombinant antigen represents
amino acids 717-847 of the NS1 region of the HCV sequence, and is expressed by
the plasmid pHCV-78 (identified by SEQ. ID. NO. 3). The fusion protein will be
referred to as pHCV-78 and pHCV-78' shall be the designation for a polypeptide
from the NS1 region representing from about amino acids 717-847 of the HCV
sequence prepared using other recombinant or synthetic methodologies. It has been
designated clone HCV CKS-NS1S3 based on the strategy used in its construction.
Figure 44 illustrates the position of pHCV-77, pHCV-65 and pHCV-78 in the NS1
region of the HCV genome. The recombinant antigen produced by pHCV-80 is
identified as SEQ. ID. NO. 4 and is designated HCV CKS-NS1S1-NS1S2. The fusion
protein is also designated by pHCV-80 and pHCV-80' refers to the polypeptide
located in the NS1 region of HCV, representing amino acids 365-731 of the HCV
genome, prepared using different recombinant methodologies. Figure 45 illustrates
the position of pHCV-80 within the HCV genome. HCV CKS-Full Length NS1 is the
designation for the recombinant protein pHCV-92 (SEQ. ID. NO. 5). It represents
amino acids 365-847 of the HCV genome. The fusion protein will be referred to as
pHCV-92 and pHCV-92' shall be the designation for the polypeptide from the NS1
region representing amino acids 365-847 of the HCV sequence prepared using
other recombinant or synthetic methodologies. Figure 46 illustrates the position of
pHCV-92 in the HCV genome. These antigens are used in the inventive
immunoassays to detect the presence of HCV antibodies in samples.
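The amino acid ranges recited above can be tabulated to show how the five NS1 constructs relate to one another. The sketch below is illustrative only: the `Fragment` type and `overlap` helper are hypothetical names introduced here, and the coordinates are simply those stated in the text, treated as inclusive.

```python
from typing import NamedTuple


class Fragment(NamedTuple):
    """An NS1 construct with inclusive HCV amino acid coordinates from the text."""
    plasmid: str
    name: str
    start: int
    end: int

    @property
    def length(self) -> int:
        # Inclusive range, so both endpoints count.
        return self.end - self.start + 1


NS1_FRAGMENTS = [
    Fragment("pHCV-77", "HCV CKS-NS1S1", 365, 579),
    Fragment("pHCV-65", "HCV CKS-NS1S2", 565, 731),
    Fragment("pHCV-78", "HCV CKS-NS1S3", 717, 847),
    Fragment("pHCV-80", "HCV CKS-NS1S1-NS1S2", 365, 731),
    Fragment("pHCV-92", "HCV CKS-Full Length NS1", 365, 847),
]


def overlap(a: Fragment, b: Fragment) -> int:
    """Number of HCV residues shared by two fragments (0 if they are disjoint)."""
    shared = min(a.end, b.end) - max(a.start, b.start) + 1
    return max(0, shared)


if __name__ == "__main__":
    for frag in NS1_FRAGMENTS:
        print(f"{frag.plasmid:8s} {frag.start}-{frag.end}  {frag.length} residues")
    s1, s2, s3 = NS1_FRAGMENTS[:3]
    print("S1/S2 overlap:", overlap(s1, s2))
    print("S2/S3 overlap:", overlap(s2, s3))
```

On these coordinates the three subfragments overlap their neighbours by a short stretch of residues, pHCV-80 spans the first two subfragments, and pHCV-92 spans all three.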

One assay format according to the invention provides a screening assay for
identifying the presence of an antibody that is immunologically reactive with an HCV
antigen. Briefly, a fluid sample is incubated with a solid support containing the
commonly bound recombinant proteins. Finally, the antibody-antigen complex is
detected. In a modification of the screening assay the solid support additionally
contains recombinant polypeptide c100-3.

Another assay format provides a confirmatory assay for unequivocally
identifying the presence of an antibody that is immunologically reactive with an HCV
antigen. The confirmatory assay includes synthetic peptides or recombinant
antigens representing the epitopes contained within the NS1 region of the HCV
genome, which are the same regions represented by the recombinant proteins
described in the screening assay. These are pHCV-77, pHCV-65, pHCV-78, pHCV-80
and pHCV-92. Recombinant proteins used in the confirmatory assay should have
a heterologous source of antigen to that used in the primary screening assay (i.e.
should not be an E.coli-derived recombinant antigen nor a recombinant antigen
composed, in part, of CKS sequences). Briefly, specimens repeatedly reactive in the
primary screening assay are retested in the confirmatory assay. Aliquots
containing identical amounts of specimen are contacted with a synthetic peptide or
recombinant antigen individually coated onto a solid support. Finally, the antibody-
antigen complex is detected. The polypeptides or recombinant proteins can be
utilized as indicated or combined with other polypeptides and recombinant proteins
as described herein and also described in U.S. Serial No. 456,162 entitled "Hepatitis
C Assay", filed December 22, 1989, which enjoys common ownership and is
incorporated herein by reference.

Another assay format provides a competition assay or neutralization assay
directed to the confirmation that positive results are not false by identifying the
presence of an antibody that is immunologically reactive with an HCV antigen in a
fluid sample where the sample is used to prepare first and second immunologically
equivalent aliquots. The first aliquot is contacted with solid support containing a
bound polypeptide which contains at least one epitope of an HCV antigen under
conditions suitable for complexing with the antibody to form a detectable antibody-
polypeptide complex and the second aliquot is first contacted with the same solid
support containing bound polypeptide. The preferred recombinant polypeptides
include pHCV-77, pHCV-65, pHCV-78, pHCV-80 and pHCV-92.

Another assay format provides an immunodot assay for identifying the
presence of an antibody that is immunologically reactive with an HCV antigen by
concurrently contacting a sample with recombinant polypeptides each containing
distinct epitopes of an HCV antigen under conditions suitable for complexing the
antibody with at least one of the polypeptides and detecting the antibody-polypeptide
complex by reacting the complex with color-producing reagents. The preferred
recombinant polypeptides employed include those recombinant polypeptides derived
from pHCV-77, pHCV-65, pHCV-78, pHCV-80, as well as pHCV-92.

In all of the assays, the sample is preferably diluted before contacting the
polypeptide absorbed on a solid support. Samples may be obtained from different
biological samples such as whole blood, serum, plasma, cerebral spinal fluid, and
lymphocyte or cell culture supernatants. Solid support materials may include
cellulose materials, such as paper and nitrocellulose, natural and synthetic
polymeric materials, such as polyacrylamide, polystyrene, and cotton, porous gels
such as silica gel, agarose, dextran and gelatin, and inorganic materials such as
deactivated alumina, magnesium sulfate and glass. Suitable solid support materials
may be used in assays in a variety of well known physical configurations, including
microtiter wells, test tubes, beads, strips, membranes, and microparticles. A
preferred solid support for a non-immunodot assay is a polystyrene bead. A
preferred solid support for an immunodot assay is nitrocellulose.

Suitable methods and reagents for detecting an antibody-antigen complex in
an assay of the present invention are commercially available or known in the
relevant art. Representative methods may employ detection reagents such as
enzymatic, radioisotopic, fluorescent, luminescent, or chemiluminescent reagents.
These reagents may be used to prepare hapten-labelled antihapten detection systems
according to known procedures; for example, a biotin-labelled antibiotin system
may be used to detect an antibody-antigen complex.

The present invention also encompasses assay kits including polypeptides 
which contain at least one epitope of an HCV antigen bound to a solid support as well 
as needed sample preparation reagents, wash reagents, detection reagents and signal 
producing reagents. 

Other aspects and advantages of the invention will be apparent to those
skilled in the art upon consideration of the following detailed description which
provides illustrations of the invention in its presently preferred embodiments.

E.coli strains containing plasmids useful for constructs of the invention have
been deposited at the American Type Culture Collection, Rockville, Maryland on
August 10, 1990, under the accession Nos. ATCC 68380 (pHCV-23), ATCC 68381
(pHCV-29), ATCC 68382 (pHCV-31), ATCC 68383 (pHCV-34) and on November
6, 1990 for E.coli strains containing plasmids useful for constructs under the
accession Nos. ATCC 68458 (pHCV-50), ATCC 68459 (pHCV-57), ATCC 68460
(pHCV-103), ATCC 68461 (pHCV-102), ATCC 68462 (pHCV-51), ATCC 68463
(pHCV-105), ATCC 68464 (pHCV-107), ATCC 68465 (pHCV-104), ATCC 68466
(pHCV-45), ATCC 68467 (pHCV-48), ATCC 68468 (pHCV-49), ATCC 68469
(pHCV-58) and ATCC 68470 (pHCV-101). E.coli strains containing plasmids
useful for constructs of the invention have been deposited at the A.T.C.C. on
September 26, 1991 under deposit numbers ATCC 68690 (pHCV-77), ATCC
68696 (pHCV-65), ATCC 68689 (pHCV-78), ATCC 68688 (pHCV-80) and ATCC
68695 (pHCV-92).

BRIEF DESCRIPTION OF THE DRAWINGS 

FIGURE 1 illustrates the HCV genome.

FIGURE 2 illustrates the use of recombinant polypeptides to identify the
presence of antibodies in a chimpanzee inoculated with HCV.

FIGURE 3 illustrates the sensitivity and specificity increase in using the
screening assay using pHCV-34 and pHCV-31 antigens.

FIGURE 4 illustrates the construction of plasmid pHCV-34.

FIGURE 5 illustrates fusion protein pHCV-34.

FIGURE 6 illustrates the expression of pHCV-34 proteins in E.coli.

FIGURE 7 illustrates the construction of plasmid pHCV-23.

FIGURE 8 illustrates the construction of plasmid pHCV-29.

FIGURE 9 illustrates the construction of plasmid pHCV-31.

FIGURE 10 illustrates the fusion protein pHCV-31.

FIGURE 11 illustrates the expression of pHCV-29 in E.coli.

FIGURE 12 illustrates the expression of pHCV-23 in E.coli.

FIGURE 13 illustrates the expression of pHCV-31 in E.coli.

FIGURE 14 illustrates the increased sensitivity using the screening assay
utilizing pHCV-34.

FIGURE 15 illustrates the increased specificity with the screening assay
utilizing pHCV-34 and pHCV-31.

FIGURE 16 illustrates the results in hemodialysis patients using the
screening and confirmatory assays.

FIGURE 17 illustrates earlier detection of HCV in a hemodialysis patient
using the screening assay.

FIGURE 18 illustrates the results of the screening assay utilizing pHCV-34
and pHCV-31 on samples from individuals with acute NANBH.

FIGURE 19 illustrates the results of the confirmatory assay of the same
population group as in Figure 18.

FIGURE 20 illustrates the results of the screening and confirmatory assays
on individuals infected with chronic NANBH.

FIGURE 21 illustrates preferred buffers, pH conditions, and spotting
concentrations for the HCV immunodot assay.

FIGURE 22 illustrates the results of the HCV immunodot assay.

FIGURE 23 illustrates the fusion protein pHCV-45.

FIGURE 24 illustrates the expression of pHCV-45 in E.coli.

FIGURE 25 illustrates the fusion protein pHCV-48.

FIGURE 26 illustrates the expression of pHCV-48 in E.coli.

FIGURE 27 illustrates the fusion protein pHCV-51.

FIGURE 28 illustrates the expression of pHCV-51 in E.coli.

FIGURE 29 illustrates the fusion protein pHCV-50.

FIGURE 30 illustrates the expression of pHCV-50 in E.coli.

FIGURE 31 illustrates the fusion protein pHCV-49.

FIGURE 32 illustrates the expression of pHCV-49 in E.coli.

FIGURE 33 illustrates an immunoblot of pHCV-23, pHCV-45, pHCV-48,
pHCV-51, pHCV-50 and pHCV-49.

FIGURE 34 illustrates the fusion proteins pHCV-24, pHCV-57, pHCV-58.

FIGURE 35 illustrates the expression of pHCV-24, pHCV-57, and pHCV-58
in E.coli.

FIGURE 36 illustrates the fusion protein pHCV-105.

FIGURE 37 illustrates the expression of pHCV-105 in E.coli.

FIGURE 38 illustrates the fusion protein pHCV-103.

FIGURE 39 illustrates the fusion protein pHCV-101.

FIGURE 40 illustrates the fusion protein pHCV-102.

FIGURE 41 illustrates the expression of pHCV-102 in E.coli.

FIGURE 42 illustrates the fusion protein pHCV-107.

FIGURE 43 illustrates the fusion protein pHCV-104.

FIGURE 44 illustrates the NS1 region of the HCV genome, and in particular,
the locations of pHCV-77, pHCV-65 and pHCV-78.

FIGURE 45 illustrates the NS1 region of the HCV genome, and in particular,
the location of pHCV-80.

FIGURE 46 illustrates the NS1 region of the HCV genome, and in particular,
the location of pHCV-92.

FIGURE 47A illustrates the expression of pHCV-77 in E.coli and FIGURE
47B illustrates an immunoblot of pHCV-77 in E.coli.

FIGURE 48A illustrates the expression of pHCV-65 in E.coli and FIGURE
48B illustrates an immunoblot of pHCV-65 in E.coli.

FIGURE 49A illustrates the expression of pHCV-80 in E.coli and FIGURE
49B illustrates an immunoblot of pHCV-80 in E.coli.

DETAILED DESCRIPTION OF THE INVENTION

The present invention is directed to an assay to detect an antibody to an HCV
antigen in a sample. Human serum or plasma is preferably diluted in a sample
diluent and incubated with a polystyrene bead coated with a recombinant polypeptide
that represents a distinct antigenic region of the HCV genome. If antibodies are
present in the sample they will form a complex with the antigenic polypeptide and
become affixed to the polystyrene bead. After the complex has formed, unbound
materials and reagents are removed by washing the bead and the bead-antigen-
antibody complex is reacted with a solution containing horseradish peroxidase
labeled goat antibodies directed against human antibodies. This peroxidase enzyme
then binds to the antigen-antibody complex already fixed to the bead. In a final
reaction the horseradish peroxidase is contacted with o-phenylenediamine and
hydrogen peroxide which results in a yellow-orange color. The intensity of the
color is proportional to the amount of antibody which initially binds to the antigen
fixed to the bead.

The preferred recombinant polypeptides having HCV antigenic epitopes were
selected from portions of the HCV genome which encoded polypeptides which
possessed amino acid sequences similar to other known immunologically reactive
agents and which were identified as having some immunological reactivity. (The
immunological reactivity of a polypeptide was initially identified by reacting the
cellular extract of E.coli clones which had been transformed with cDNA fragments of
the HCV genome with HCV infected serum. Polypeptides expressed by clones
containing the incorporated cDNA were immunologically reactive with serum known
to contain antibody to HCV antigens.) An analysis of a given amino acid sequence,
however, only provides rough guides to predicting immunological reactivity. There
is no invariably predictable way to ensure immunological activity short of
preparing a given amino acid sequence and testing the suspected sequence in an
assay.

The use of recombinant polypeptides representing distinct antigenic regions
of the HCV genome to detect the presence of an antibody to an HCV antigen is
illustrated in Figure 2. The course of HCV infection in the chimpanzee, Pan, was
followed with one assay using recombinant c100-3 polypeptide and with another
improved assay, using the two recombinant antigens CKS-Core (pHCV-34)
(SEQ. ID. NO. 5 and 7) and pHCV-33c-BCD (pHCV-31) (SEQ. ID. NO. 8 and 9)
expressed by the plasmids pHCV-34 and pHCV-31, respectively. The assay
utilizing the recombinant pHCV-34 and pHCV-31 proteins detected plasma antibody
three weeks prior to detection of antibody by the assay using c100-3.

The results of a study which followed the course of HCV
infection in Pan and six other chimpanzees using the two assays described above are
summarized in Figure 3. Both assays gave negative results before inoculation and
both assays detected the presence of antibodies after the animal had been infected
with HCV. However, in the comparison of the two assays, the improved screening
assay using pHCV-34 and pHCV-31 detected seroconversion to HCV antigens at an
earlier or equivalent bleed date in six of the seven chimpanzees. Data from these
chimpanzee studies clearly demonstrate that overall detection of HCV antibodies is
greatly increased with the assay utilizing the pHCV-34 and pHCV-31 proteins.
This test is sufficiently sensitive to detect seroconversion during the acute phase of
this disease, defined as an elevation in ALT levels, in most animals. Equally
important is the high degree of specificity of the test, as no pre-inoculation
specimens were reactive.

The polypeptides useful in the practice of this invention are produced using
recombinant technologies. The DNA sequences which encode the desired polypeptides
are preferably assembled from fragments of the total desired sequence. Synthetic
DNA fragments of the HCV genome can be synthesized based on their corresponding
amino acid sequences. Once the amino acid sequence is chosen, this is then reverse
translated to determine the complementary DNA sequence using codons optimized to
facilitate expression in the chosen system. The fragments are generally prepared
using well known automated processes and apparatus. After the complete sequence
has been prepared the desired sequence is incorporated into an expression vector
which is transformed into a host cell. The DNA sequence is then expressed by the
host cell to give the desired polypeptide which is harvested from the host cell or
from the medium in which the host cell is cultured. When smaller peptides are to
be made using recombinant technologies it may be advantageous to prepare a single
DNA sequence which encodes several copies of the desired polypeptide in a connected
chain. The long chain is then isolated and the chain is cleaved into the shorter,
desired sequences.
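The reverse translation step described above can be sketched in a few lines. This is a hedged illustration only: the one-codon-per-residue table below is a hypothetical set of choices assumed for the example and is not the codon selection actually used for the constructs of the invention, and the example peptide is arbitrary.

```python
# Assumption: one illustrative codon per amino acid. Each codon is a valid
# standard-genetic-code codon for its residue, but the selection is hypothetical.
PREFERRED_CODON = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
}

# Inverse lookup; valid only for DNA built from the table above.
CODON_TO_AA = {codon: aa for aa, codon in PREFERRED_CODON.items()}


def reverse_translate(peptide: str) -> str:
    """Map each residue (one-letter code) to its chosen codon."""
    try:
        return "".join(PREFERRED_CODON[aa] for aa in peptide.upper())
    except KeyError as exc:
        raise ValueError(f"unsupported residue: {exc.args[0]!r}") from None


def translate(dna: str) -> str:
    """Translate DNA made of table codons back to a peptide (round-trip check)."""
    if len(dna) % 3 != 0:
        raise ValueError("DNA length must be a multiple of 3")
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna), 3))


if __name__ == "__main__":
    peptide = "MKTAYIAK"  # arbitrary example peptide, not an HCV sequence
    dna = reverse_translate(peptide)
    print(dna)
    print(translate(dna) == peptide)
```

A practical design would also need to consider restriction sites for subcloning and the division of the gene into synthesizable fragments, which this sketch leaves out.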

The methodology of polymerase chain reaction (PCR) may also be employed
to develop PCR amplified genes from any portion of the HCV genome, which in turn
may then be cloned and expressed in a manner similar to the synthetic genes.

Vector systems which can be used include plant, bacterial, yeast, insect, and
mammalian expression systems. It is preferred that the codons are optimized for
expression in the system used.

A preferred expression system utilizes a carrier gene for a fusion system
where the recombinant HCV proteins are expressed as a fusion protein of an E.coli
enzyme, CKS (CTP:CMP-3-deoxy-manno-octulosonate cytidylyl transferase or
CMP-KDO synthetase). The CKS method of protein synthesis is disclosed in U.S.
Patent Applications Serial Nos. 167,067 and 276,263 filed March 11, 1988 and
November 23, 1988, respectively, by Bolling (EPO 891029282) which enjoy
common ownership and are incorporated herein by reference.

Other expression systems may be utilized including the lambda pL vector
system whose features include a strong lambda pL promoter, a strong three-frame
translation terminator rrnBt1, and translation starting at an ATG codon.

In the present invention, the amino acid sequences of the
recombinant HCV antigens of interest were reverse translated using codons
optimized to facilitate high level expression in E.coli. Individual oligonucleotides
were synthesized by the method of oligonucleotide directed double-stranded break
repair disclosed in U.S. Patent Application Serial No. 883,242, filed July 8, 1986
by Mandecki (EPO 87109357.1) which enjoys common ownership and is
incorporated herein by reference. Alternatively, the individual oligonucleotides
may be synthesized on the Applied Biosystem 380A DNA synthesizer using methods
and reagents recommended by the manufacturer. The DNA sequences of the
individual oligonucleotides were confirmed using the Sanger dideoxy chain
termination method (Sanger et al., J. Mole. Biol., 162:729 (1982)). These
individual gene fragments were then annealed and ligated together and cloned as
EcoRI-BamHI subfragments in the CKS fusion vector pJO200. After subsequent
DNA sequence confirmation by the Sanger dideoxy chain termination method, the
subfragments were digested with appropriate restriction enzymes, gel purified,
ligated and cloned again as an EcoRI-BamHI fragment in the CKS fusion vector
pJO200. The resulting clones were mapped to identify a hybrid gene consisting of
the EcoRI-BamHI HCV fragment inserted at the 3' end of the CKS (CMP-KDO
synthetase) gene. The resultant fusion proteins, under control of the lac promoter,
consist of 239 amino acids of the CKS protein fused to the various regions of HCV.

The synthesis, cloning, and characterization of the recombinant polypeptides
as well as the preferred formats for assays using these polypeptides are provided in
the following examples. Examples 1 and 2 describe the synthesis and cloning of
CKS-Core and CKS-33-BCD, respectively. Example 3 describes a screening assay.
Example 4 describes a confirmatory assay. Example 5 describes a competition
assay. Example 6 describes an immunodot assay. Example 7 describes the
synthesis and cloning of HCV CKS-NS5E, CKS-NS5F, CKS-NS5G, CKS-NS5H and
CKS-NS5I. Example 8 describes the preparation of HCV CKS-C100 vectors.
Example 9 describes the preparation of HCV PCR derived expression vectors.
Example 10 describes the synthesis and characterization of pHCV-77 of NS1.
Example 11 describes the synthesis and characterization of pHCV-65 of NS1.
Example 12 describes the synthesis and characterization of pHCV-78 of NS1.
Example 13 describes the synthesis and characterization of pHCV-80 of NS1.
Example 14 describes the synthesis and characterization of pHCV-92 of NS1.

REAGENTS AND ENZYMES

Media such as Luria-Bertani (LB) and Superbroth II (Dri Form) were
obtained from Gibco Laboratories Life Technologies, Inc., Madison, Wisconsin.
Restriction enzymes, Klenow fragment of DNA polymerase I, T4 DNA ligase, T4
polynucleotide kinase, nucleic acid molecular weight standards, M13 sequencing
system, X-gal (5-bromo-4-chloro-3-indolyl-β-D-galactoside), IPTG
(isopropyl-β-D-thiogalactoside), glycerol, dithiothreitol, and 4-chloro-1-naphthol
were purchased from Boehringer Mannheim Biochemicals, Indianapolis, Indiana; or
New England Biolabs, Inc., Beverly, Massachusetts; or Bethesda Research
Laboratories Life Technologies, Inc., Gaithersburg, Maryland. Prestained protein
molecular weight standards, acrylamide (crystallized, electrophoretic grade
>99%); N,N'-methylene-bis-acrylamide (BIS); N,N,N',N'-
tetramethylethylenediamine (TEMED) and sodium dodecylsulfate (SDS) were
purchased from BioRad Laboratories, Richmond, California. Lysozyme and
ampicillin were obtained from Sigma Chemical Co., St. Louis, Missouri.
Horseradish peroxidase (HRPO) labeled secondary antibodies were obtained from
Kirkegaard & Perry Laboratories, Inc., Gaithersburg, Maryland. Seaplaque®
agarose (low melting agarose) was purchased from FMC Bioproducts, Rockland,
Maine.

T50E10 contained 50 mM Tris, pH 8.0, and 10 mM EDTA; 1X TG contained 100 mM
Tris, pH 7.5, and 10% glycerol; 2X SDS/PAGE loading buffer consisted of 15%
glycerol, 5% SDS, 100 mM Tris base, 1 M β-mercaptoethanol and 0.8%
bromophenol blue dye; TBS contained 50 mM Tris, pH 8.0, and 150 mM sodium
chloride; Blocking Solution consisted of 5% Carnation nonfat dry milk in TBS.
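
By way of illustration only, the TBS recipe above may be converted from molar
concentrations to weighed amounts as in the following Python sketch. The molar
masses of Tris base and sodium chloride are standard reference values and are
not stated in this specification; the function and variable names are
hypothetical. Adjustment of the pH to 8.0 is not modeled.

```python
# Molar masses in g/mol. These are standard reference values, assumed here;
# they do not appear in the specification.
MOLAR_MASS = {"Tris base": 121.14, "NaCl": 58.44}


def grams_needed(conc_mM: float, molar_mass: float, volume_L: float = 1.0) -> float:
    """Return the mass of solute (g) needed for a millimolar concentration."""
    return conc_mM / 1000.0 * molar_mass * volume_L


# TBS as described in the text: 50 mM Tris and 150 mM sodium chloride.
TBS_mM = {"Tris base": 50.0, "NaCl": 150.0}

# Grams of each component per liter of TBS.
tbs_grams_per_liter = {
    name: grams_needed(conc, MOLAR_MASS[name]) for name, conc in TBS_mM.items()
}

if __name__ == "__main__":
    for name, grams in tbs_grams_per_liter.items():
        print(f"{name}: {grams:.3f} g per liter")
```

The same helper may be applied to the other buffers by supplying the relevant
concentration, molar mass and volume.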

HOST CELL CULTURES, DNA SOURCES AND VECTORS
E.coli JM103 cells, pUC8, pUC18, pUC19 and M13 cloning vectors were
purchased from Pharmacia LKB Biotechnology, Inc., Piscataway, New Jersey;
Competent Epicurean™ coli strains XL1-Blue and JM109 were purchased from
Stratagene Cloning Systems, La Jolla, California. RR1 cells were obtained from Coli
Genetic Stock Center, Yale University, New Haven, Connecticut; and E.coli CAG456
cells from Dr. Carol Gross, University of Wisconsin, Madison, Wisconsin. Vector
pRK248.cIts was obtained from Dr. Donald R. Helinski, University of California,
San Diego, California.

GENERAL METHODS

All restriction enzyme digestions were performed according to suppliers'
instructions. At least 5 units of enzyme were used per microgram of DNA, and
sufficient incubation was allowed to complete digestion of DNA. Standard procedures
were used for minicell lysate DNA preparation, phenol-chloroform extraction,
ethanol precipitation of DNA, restriction analysis of DNA on agarose, and low
melting agarose gel purification of DNA fragments (Maniatis et al., Molecular
Cloning: A Laboratory Manual [New York: Cold Spring Harbor, 1982]). Plasmid
isolations from E.coli strains used the alkali lysis procedure and cesium chloride-
ethidium bromide density gradient method (Maniatis et al., supra). Standard
buffers were used for T4 DNA ligase and T4 polynucleotide kinase (Maniatis et al.,
supra).

EXAMPLE 1. CKS-CORE

A. Construction of the Plasmid pJO200

The cloning vector pJO200 allows the fusion of recombinant proteins to the
CKS protein. The plasmid consists of the plasmid pBR322 with a modified lac
promoter fused to a KdsB gene fragment (encoding the first 239 of the entire 248
amino acids of the E.coli CMP-KDO synthetase, or CKS protein), and a synthetic
linker fused to the end of the KdsB gene fragment. The cloning vector pJO200 is a
modification of vector pTB210. The synthetic linker includes multiple restriction
sites for insertion of genes, translational stop signals, and the trpA rho-
independent transcriptional terminator. The CKS method of protein synthesis as
well as CKS vectors including pTB210 are disclosed in U.S. Patent Application
Serial Nos. 167,067 and 276,263, filed March 11, 1988 and November 23,
1988, respectively, by Bolling (EPO 89102928.2) which enjoy common
ownership, and are herein incorporated by reference.

B. Preparation of HCV CKS-Core Expression Vector 

Six individual oligonucleotides representing amino acids 1-150 of the HCV
genome were ligated together and cloned as a 466 base pair EcoRI-BamHI fragment
into the CKS fusion vector pJO200 as presented in Figure 4. The complete DNA
sequence of this plasmid, designated pHCV-34, and the entire amino acid sequence of
the pHCV-34 recombinant antigen produced are presented in SEQ.ID.NO. 6 and 7. The
resultant fusion protein, HCV CKS-Core, consists of 239 amino acids of CKS, seven
amino acids contributed by linker DNA sequences, and the first 150 amino acids of
HCV as illustrated in Figure 5.

The pHCV-34 plasmid and the CKS plasmid pTB210 were transformed into
E.coli K-12 strain XL-1 (recA1, endA1, gyrA96, thi-1, hsdR17, supE44, relA1,
lac/F', proAB, lacIqZDM15, Tn10) cells made competent by the calcium chloride
method. In these constructions the expression of the CKS fusion proteins was under
the control of the lac promoter and was induced by the addition of IPTG. These
plasmids replicated as independent elements, were nonmobilizable and were
maintained at approximately 10-30 copies per cell.

C. Characterization of Recombinant HCV-Core

In order to establish that clone pHCV-34 expressed the unique HCV-CKS
Core protein, the pHCV-34/XL-1 culture was grown overnight at 37°C in growth
media consisting of yeast extract, tryptone, phosphate salts, glucose, and ampicillin.
When the culture reached an OD600 of 1.0, IPTG was added to a final concentration
of 1 mM to induce expression. Samples (1.5 ml) were removed at 1 hour intervals,
and cells were pelleted and resuspended to an OD600 of 1.0 in 2X SDS/PAGE loading
buffer. Aliquots (15 ul) of the prepared samples were separated on duplicate
12.5% SDS/PAGE gels.

One gel was fixed in a solution of 50% methanol and 10% acetic acid for 20
minutes at room temperature, and then stained with 0.25% Coomassie blue dye in a
solution of 50% methanol and 10% acetic acid for 30 minutes. Destaining was
carried out using a solution of 10% methanol and 7% acetic acid for 3-4 hours, or
until a clear background was obtained.

Figure 6 presents the expression of pHCV-34 proteins in E.coli. Molecular
weight standards were run in Lane M. Lane 1 contains the plasmid pJO200, the CKS
vector without the HCV sequence. The arrows on the left indicate the mobilities of
the molecular weight markers from top to bottom: 110,000; 84,000; 47,000;
33,000; 24,000; and 15,000 daltons. The arrows on the right indicate the
mobilities of the recombinant HCV proteins. Lane 2 contains the E.coli lysate
containing pHCV-34 expressing CKS-Core (amino acids 1 to 150) prior to
induction; and Lane 3 after 3 hours of induction. The results show that the
recombinant protein pHCV-34 has an apparent mobility corresponding to a
molecular size of 48,000 daltons. This compares acceptably with the predicted
molecular mass of 43,750 daltons.

Proteins from the second 12.5% SDS/PAGE gel were electrophoretically
transferred to nitrocellulose for immunoblotting. The nitrocellulose sheet
containing the transferred proteins was incubated with Blocking Solution for one
hour and incubated overnight at 4°C with HCV patients' sera diluted in TBS
containing E.coli K-12 strain XL-1 lysate. The nitrocellulose sheet was washed
three times in TBS, then incubated with HRPO-labeled goat anti-human IgG, diluted
in TBS containing 10% fetal calf serum. The nitrocellulose was washed three times
with TBS and the color was developed in TBS containing 2 mg/ml 4-chloro-1-
naphthol, 0.02% hydrogen peroxide and 17% methanol. Clone pHCV-34 demonstrated
a strong immunoreactive band at 48,000 daltons with the HCV patients' sera. Thus,
the major protein in the Coomassie stained protein gel was immunoreactive.
Normal human serum did not react with any component of pHCV-34.

EXAMPLE 2. HCV CKS-33C-BCD

A. Preparation of HCV CKS-33c-BCD Expression Vector

The construction of this recombinant clone expressing the HCV CKS-33-BCD
antigen was carried out in three steps described below. First, a clone expressing
the HCV CKS-BCD antigen was constructed, designated pHCV-23. Second, a clone
expressing the HCV CKS-33 antigen was constructed, designated pHCV-29. Lastly,
the HCV BCD region was excised from pHCV-23 and inserted into pHCV-29 to
construct a clone expressing the HCV CKS-33-BCD antigen, designated pHCV-31
(SEQ.ID.NO. 8 and 9).

To construct the plasmid pHCV-23, thirteen individual oligonucleotides
representing amino acids 1676-1931 of the HCV genome were ligated together and
cloned as three separate EcoRI-BamHI subfragments into the CKS fusion vector
pJO200. After subsequent DNA sequence confirmation, the three subfragments,
designated B, C, and D respectively, were digested with the appropriate restriction
enzymes, gel purified, ligated together, and cloned as a 781 base pair EcoRI-BamHI
fragment in the CKS fusion vector pJO200, as illustrated in Figure 7. The
resulting plasmid, designated pHCV-23, expresses the HCV CKS-BCD antigen under
control of the lac promoter. The HCV CKS-BCD antigen consists of 239 amino acids
of CKS, seven amino acids contributed by linker DNA sequences, 256 amino acids
from the HCV NS4 region (amino acids 1676-1931), and 10 additional amino acids
contributed by linker DNA sequences.

To construct the plasmid pHCV-29, twelve individual oligonucleotides
representing amino acids 1192-1457 of the HCV genome were ligated together and
cloned as two separate EcoRI-BamHI subfragments in the CKS fusion vector
pJO200. After subsequent DNA sequence confirmation, the two subfragments were
digested with the appropriate restriction enzymes, gel purified, ligated together and
cloned again as an 816 base pair EcoRI-BamHI fragment in the CKS fusion vector
pJO200, as illustrated in Figure 8. The resulting plasmid, designated pHCV-29,
expresses the CKS-33 antigen under control of the lac promoter. The HCV CKS-33
antigen consists of 239 amino acids of CKS, eight amino acids contributed by linker
DNA sequences, and 266 amino acids from the HCV NS3 region (amino acids 1192-
1457).

To construct the plasmid pHCV-31, the 781 base pair EcoRI-BamHI
fragment from pHCV-23 representing the HCV-BCD region was linker-adapted to
produce a ClaI-BamHI fragment which was then gel purified and ligated into
pHCV-29 at the ClaI-BamHI sites as illustrated in Figure 9. The resulting plasmid,
designated pHCV-31, expresses the pHCV-31 antigen under control of the lac
promoter. The complete DNA sequence of pHCV-31 and the entire amino acid
sequence of the HCV CKS-33-BCD recombinant antigen produced are presented in
SEQ.ID.NO. 8 and 9. The HCV CKS-33-BCD antigen consists of 239 amino acids of
CKS, eight amino acids contributed by linker DNA sequences, 266 amino acids of the
HCV NS3 region (amino acids 1192-1457), 2 amino acids contributed by linker
DNA sequences, 256 amino acids of the HCV NS4 region (amino acids 1676-1931),
and 10 additional amino acids contributed by linker DNA sequences. Figure 12
presents a schematic representation of the pHCV-31 antigen.
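
For the reader's convenience, the segment compositions recited above for the
fusion proteins of Examples 1 and 2 may be tallied as in the following Python
sketch. The segment lengths are taken from the text; the dictionary layout and
function names are hypothetical, and the totals are simple sums that are not
themselves recited in this specification.

```python
# Segment lengths (amino acids) as recited in the text for each fusion protein.
FUSIONS = {
    "CKS-Core (pHCV-34)": [("CKS", 239), ("linker", 7), ("HCV 1-150", 150)],
    "CKS-BCD (pHCV-23)": [
        ("CKS", 239), ("linker", 7), ("HCV NS4 1676-1931", 256), ("linker", 10),
    ],
    "CKS-33 (pHCV-29)": [("CKS", 239), ("linker", 8), ("HCV NS3 1192-1457", 266)],
    "CKS-33-BCD (pHCV-31)": [
        ("CKS", 239), ("linker", 8), ("HCV NS3 1192-1457", 266),
        ("linker", 2), ("HCV NS4 1676-1931", 256), ("linker", 10),
    ],
}


def total_length(segments):
    """Sum the residue counts of all segments of one fusion protein."""
    return sum(count for _, count in segments)


def span_length(first, last):
    """Number of residues in an inclusive amino acid range."""
    return last - first + 1


if __name__ == "__main__":
    for name, segments in FUSIONS.items():
        print(f"{name}: {total_length(segments)} amino acids")
```

The inclusive ranges 1192-1457 and 1676-1931 correspond to 266 and 256
residues, which agrees with the segment lengths recited for the NS3 and NS4
regions.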

The pHCV-31 plasmid was transformed into E.coli K-12 strain XL-1 in a
manner similar to the pHCV-34 and CKS-pTB210 plasmids of Example 1.

B. Characterization of Recombinant HCV CKS-33-BCD

Characterization of pHCV CKS-33-BCD was carried out in a manner similar
to pHCV CKS-Core of Example 1. SDS/PAGE gels were run for
E.coli lysates containing the plasmids pHCV-29 (Figure 11), pHCV-23 (Figure
12), and pHCV-31 (Figure 13) expressing the recombinant fusion proteins CKS-
33c, CKS-BCD, and CKS-33-BCD, respectively. For all three figures, molecular
weight standards were run in Lane M, with the arrows on the left indicating
mobilities of the molecular weight markers from top to bottom: 110,000;
84,000; 47,000; 33,000; 24,000; and 16,000 daltons. In Figure 11, Lane 1
contained the E.coli lysate containing pHCV-29 expressing HCV CKS-33c (amino
acids 1192 to 1457) prior to induction and lane 2 after 4 hours induction. These
results show that the recombinant pHCV-29 fusion protein has an apparent
mobility corresponding to a molecular size of 60,000 daltons. This compares
acceptably to the predicted molecular mass of 54,911 daltons.

In Figure 12, Lane 1 contained the E.coli lysate containing pJO200, the
CKS vector without the HCV sequence. Lane 2 contained pHCV-20 expressing the
HCV CKS-B (amino acids 1676 to 1790). Lane 3 contained the fusion protein
pHCV-23 (amino acids 1676-1931). These results show that the recombinant
pHCV-23 fusion protein has an apparent mobility corresponding to a molecular size
of 55,000 daltons. This compares acceptably to the predicted molecular mass of
55,070 daltons.

In Figure 13, Lane 1 contained the E.coli lysate containing pJO200, the CKS
vector without the HCV sequences. Lane 2 contained pHCV-31 expressing the
CKS-33c-BCD fusion protein (amino acids 1192 to 1457 and 1676 to 1931) prior to
induction and lane 3 after 2 hours induction. These results show that the
recombinant pHCV-31 (CKS-33c-BCD) fusion protein has an apparent mobility
corresponding to a molecular size of 90,000 daltons. This compares acceptably to
the predicted molecular mass of 82,995 daltons.

An immunoblot was also run on one of the SDS/PAGE gels derived from the
pHCV-31/XL-1 culture. Human serum from an HCV exposed individual reacted
strongly with the major pHCV-31 band at 90,000 daltons. Normal human serum
did not react with any component of the pHCV-31 (CKS-33-BCD) preparations.

EXAMPLE 3. SCREENING ASSAY

The use of recombinant polypeptides which contain epitopes within c100-3
as well as epitopes from other antigenic regions from the HCV genome provides
immunological assays which have increased sensitivity and may be more specific
than HCV immunological assays using epitopes within c100-3 alone.

In the presently preferred screening assay, the procedure uses two E.coli
expressed recombinant proteins, CKS-Core (pHCV-34) and CKS-33-BCD (pHCV-
31), representing three distinct regions of the HCV genome. These recombinant
polypeptides were prepared following procedures described above. In the screening
assay, both recombinant antigens are coated onto the same polystyrene bead. In a
modification of the screening assay the polystyrene bead may also be coated with the
SOD-fusion polypeptide c100-3.

The polystyrene beads are first washed with distilled water and propanol and
then incubated with a solution containing recombinant pHCV-31 diluted to 0.5 to
2.0 ug/ml and pHCV-34 diluted to 0.1 to 0.5 ug/ml in 0.1 M NaH2PO4·H2O with
0.4 M NaCl and 0.0022% Triton X-100, pH 6.5. The beads are incubated in the
antigen solution for 2 hours (plus or minus 10 minutes) at 38-42°C, washed in
PBS and soaked in 0.1% (w/v) Triton X-100 in PBS for 60 minutes at 38-42°C.
The beads are then washed two times in phosphate buffered saline (PBS), overcoated
with a solution of 5.0% (w/v) bovine serum albumin (BSA) in PBS for 60 minutes
at 38-42°C and washed one time in PBS. Finally, the beads are overcoated with 5%
(w/v) sucrose in PBS, and dried under nitrogen or air.

The polystyrene beads coated with pHCV-31 and pHCV-34 are used in an
antibody capture format. Ten microliters of sample are added to the wells of the
reaction tray along with 400 ul of a sample diluent and the recombinant coated bead.
The sample diluent consists of 10% (v/v) bovine serum and 20% (v/v) goat serum
in 20 mM Tris phosphate buffer containing 0.15% (v/v) Triton X-100, 1% (w/v)
BSA, 1% E.coli lysate and 500 ug/ml or less CKS lysate. When the recombinant
yeast c100-3 polypeptide is used, antibodies to yeast antigens which may be
present in a sample are reacted with yeast extracts which are added to the sample
diluent (typically about 200 ug/ml). The addition of yeast extracts to the sample
diluent is used to prevent false positive results. The final material is sterile
filtered and filled in plastic bottles, and preserved with 0.1% sodium azide.

After one hour of incubation at 40°C, the beads are washed and 200 ul of
conjugate is added to the wells of the reaction tray.

The preferred conjugate is goat anti-human IgG horseradish peroxidase
conjugate. Concentrated conjugate is titered to determine a working concentration.
A twenty-fold concentrate of the working conjugate solution is then prepared by
diluting the concentrate in diluent. The 20X concentrate is sterile filtered and
stored in plastic bottles.

The conjugate diluent includes 10% (v/v) bovine serum, 10% (v/v) goat
serum and 0.15% Triton X-100 in 20 mM Tris buffer, pH 7.5, with 0.01%
gentamicin sulfate, 0.01% thimerosal and red dye. The conjugate is sterile filtered
and filled in plastic bottles.

Anti-HCV positive control is prepared from plasma units positive for
antibodies to HCV. The pool of units used includes plasma with antibodies reactive to
pHCV-31 and pHCV-34. The units are recalcified and heat inactivated at 59-61°C
for 12 hours with constant stirring. The pool is aliquoted and stored at -20°C or at
2-8°C. For each lot of positive control, the stock solution is diluted with negative
control containing 0.1% sodium azide as a preservative. The final material is
sterile filtered and filled in plastic bottles.

Anti-HCV negative control is prepared from recalcified human plasma,
negative for antibodies to pHCV-31 and pHCV-34 proteins of HCV. The plasma is
also negative for antibodies to human immunodeficiency virus (HIV) and negative
for hepatitis B surface antigen (HBsAg). The units are pooled, and 0.1% sodium
azide is added as a preservative. The final material is sterile filtered and filled in
plastic bottles.

After one hour of incubation with the conjugate at 40°C, the beads are
washed, exposed to the OPD substrate for thirty minutes at room temperature and
the reaction terminated by the addition of 1 N H2SO4. The absorbance is read at
492 nm.

In order to maintain acceptable specificity, the cutoff for the assay should be
at least 5-7 standard deviations above the absorbance value of the normal
population mean. In addition, it has generally been observed that acceptable
specificity is obtained when the population mean runs at a sample to cutoff (S/CO)
value of 0.25 or less. Consistent with these criteria, a "preclinical" cutoff for the
screening assay was selected which clearly separated most of the presumed "true
negative" from "true positive" specimens. The cutoff value was calculated as the
sum of the positive control mean absorbance value multiplied by 0.25 and the
negative control mean absorbance value. The cutoff may be expressed algebraically
as:

Cutoff value = 0.25 PCx + NCx,

where PCx and NCx denote the positive control and negative control mean
absorbance values, respectively.
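
The cutoff calculation may be illustrated by the following Python sketch, which
is offered as an example only. The function names and the replicate absorbance
values are hypothetical. The rule that a specimen with an S/CO value of 1.0 or
greater is treated as reactive is an assumption made for the illustration; it
is not stated explicitly in this specification.

```python
from statistics import mean


def screening_cutoff(pc_absorbances, nc_absorbances):
    """Cutoff = 0.25 x positive control mean + negative control mean."""
    return 0.25 * mean(pc_absorbances) + mean(nc_absorbances)


def sample_to_cutoff(absorbance, cutoff):
    """Sample to cutoff (S/CO) value for one specimen absorbance at 492 nm."""
    return absorbance / cutoff


def is_reactive(absorbance, cutoff, threshold=1.0):
    """Assumed rule: a specimen is reactive when S/CO >= threshold."""
    return sample_to_cutoff(absorbance, cutoff) >= threshold


if __name__ == "__main__":
    # Hypothetical control replicates, for illustration only.
    positive_controls = [1.20, 1.00, 1.10]
    negative_controls = [0.05, 0.07]
    cutoff = screening_cutoff(positive_controls, negative_controls)
    for a in (0.10, 0.50):
        print(f"A492={a:.2f}  S/CO={sample_to_cutoff(a, cutoff):.2f}  "
              f"reactive={is_reactive(a, cutoff)}")
```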
Testing may be performed by two methods which differ primarily in the
degree of automation and the mechanism for reading the resulting color development
in the assay. One method is referred to as the manual or Quantum™ method because
a Quantum or Quantumatic is used to read absorbance at 492 nm. It is also called the
manual method because sample pipetting, washing and reagent additions are
generally done manually by the technician, using appropriately calibrated pipettes,
dispensers and wash instruments. The second method is referred to as the PPC
method and utilizes the automated Abbott Commander® system. This system
employs a pipetting device referred to as the Sample Management Center (SMC) and
a wash/dispense/read device referred to as the Parallel Processing Center (PPC)
disclosed in E.P.O. Publication No. 91114072.1. The optical reader used in the PPC
has dual wavelength capabilities that can measure differential absorbances (peak
band and side band) from the sample wells. These readings are converted into
results by the processor's Control Center.

Screening Assay Performance

1. Serum/Plasma From Inoculated Chimpanzees

As previously described, Table I summarizes the results of a study which
followed the course of HCV infection in seven chimpanzees using a screening assay
which utilized the c100-3 polypeptide, and the screening assay which utilized
pHCV-31 and pHCV-34. Both assays gave negative results before inoculation and
both assays detected the presence of antibodies after the animal had been infected
with HCV. However, in the comparison of the two assays, the assay utilizing pHCV-
31 and pHCV-34 detected seroconversion to HCV antigens at an earlier or equivalent
bleed date in six of the seven chimpanzees. Data from these chimpanzee studies
clearly demonstrate that overall detection of HCV antibodies is greatly increased
with the assay utilizing the pHCV-31 and pHCV-34 proteins. This test is
sufficiently sensitive to detect seroconversion during the acute phase of this
disease, defined as an elevation in ALT levels, in most animals. Equally important
is the high degree of specificity of the test as no pre-inoculation specimens were
reactive.

2. Non-A, Non-B Panel II (H. Alter, NIH)

A panel of highly pedigreed human sera from Dr. H. Alter, NIH, Bethesda,
MD, containing infectious HCV sera, negative sera and other disease controls, was
tested. A total of 44 specimens were present in the panel.

Six of seven sera which were "proven infectious" in chimpanzees were
positive in both the screening assay using c100-3 as well as in the screening assay
utilizing the recombinant proteins pHCV-31 and pHCV-34. These six reactive
specimens were obtained from individuals with chronic hepatitis. All six of the
reactive specimens were confirmed positive using synthetic peptide sp67. One
specimen obtained during the acute phase of NANB post-transfusion hepatitis was
non-reactive in both screening assays.

In the group labeled "probable infectious" were three samples taken from
the same post transfusion hepatitis patient. The first two acute phase samples were
negative in both assays, but the third sample was reactive in both assays. The
disease control samples and pedigreed negative controls were uniformly negative.

All sixteen specimens detected as positive by both screening assays were
confirmed by the sp117 confirmatory assay (Figure 14). In addition, specimens 10
and 29 were newly detected in the screening assay utilizing the recombinant pHCV-
31 and pHCV-34 antigens and were reactive by the sp75 confirmatory assay.
Specimen 39 was initially reactive in the screening test utilizing pHCV-34 and
pHCV-31, but upon retesting was negative and could not be confirmed by the
confirmatory assays.

In summary, both screening tests identified 6 of 6 chronic NANBH carriers
and 1 of 4 acute NANBH samples. Paired specimens from an implicated donor were
non-reactive in the screening test utilizing c100-3 but were reactive in the
screening test with pHCV-31 and pHCV-34. Thus, the screening test utilizing the
recombinant antigens pHCV-31 and pHCV-34 appears to be more sensitive than the
screening assay utilizing c100-3. None of the disease control specimens or
pedigreed negative control specimens were reactive in either screening assay.

3. CBER Reference Panel

A reference panel for antibody to Hepatitis C was received from the Center
for Biologics Evaluation and Research (CBER). This 10 member panel consists of
eight reactive samples diluted in normal human sera negative for antibody to HCV
and two sera that contain no detectable antibody to HCV. This panel was run on the
Ortho first generation HCV EIA assay, the screening assay utilizing c100-3 and the
screening assay utilizing pHCV-31 and pHCV-34. The assay results are presented
in Figure 15.

The screening assay utilizing pHCV-31 and pHCV-34 detected all six of the
HCV positive or borderline sample dilutions. The two non-reactive sample dilutions
(709 and 710) appear to be diluted well beyond the endpoint of antibody detectability
for both screening assays. A marked increase was observed in the sample to cutoff
values for three of the members on the screening assay utilizing pHCV-31 and
pHCV-34 compared to the screening assay utilizing c100-3 or the Ortho first
generation test. All repeatably reactive specimens were confirmed.

EXAMPLE 4. CONFIRMATORY ASSAY

The confirmatory assay provides a means for unequivocally identifying the
presence of an antibody that is immunologically reactive with an HCV antigen. The
confirmatory assay includes synthetic peptides or recombinant antigens
representing major epitopes contained within the three distinct regions of the HCV
genome, which are the same regions represented by the two recombinant antigens
described in the screening assay. Recombinant proteins used in the confirmatory
assay should have a heterologous source of antigen to that used in the primary
screening assay (i.e., should not be an E.coli-derived recombinant antigen nor a
recombinant antigen composed, in part, of CKS sequences). Specimens repeatedly
reactive in the primary screening assay are retested in the confirmatory assay.
Aliquots containing identical amounts of specimen are contacted with a synthetic
peptide or recombinant antigen individually coated onto a polystyrene bead.
Seroreactivity for epitopes within the c100-3 region of the HCV genome is
confirmed by use of the synthetic peptides sp67 and sp65. The synthetic peptide
sp117 can also be used to confirm seroreactivity with the c100-3 region.
Seroreactivity for HCV epitopes within the putative core region of HCV is
confirmed by the use of the synthetic peptide sp75. In order to confirm
seroreactivity for HCV epitopes within the 33c region of HCV, a recombinant
antigen expressed as a chimeric protein with superoxide dismutase (SOD) in yeast
is used. Finally, the antibody-antigen complex is detected.

The assay protocols were similar to those described in Example 3 above. The
peptides are each individually coated onto polystyrene beads and used in an antibody
capture format similar to that described for the screening assay. Ten microliters of
specimen are added to the wells of a reaction tray along with 400 ul of a specimen
diluent and a peptide coated bead. After one hour of incubation at 40°C, the beads
are washed and 200 ul of conjugate (identical to that described in Example 3) is
added to the wells of the reaction tray. After one hour of incubation at 40°C, the
beads are washed, exposed to the OPD substrate for 30 minutes at room temperature
and the reaction terminated by the addition of 1 N H2SO4. The absorbance is read at
492 nm. The cutoff value for the peptide assay is 4 times the mean of the negative
control absorbance value.
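
The peptide assay cutoff described above may be illustrated by the following
Python sketch, given as an example only. The function names and the replicate
absorbance values are hypothetical, and the treatment of a specimen at or above
the cutoff as confirmed is an assumption made for the illustration.

```python
from statistics import mean


def peptide_cutoff(nc_absorbances, factor=4.0):
    """Cutoff for the peptide assay: 4 times the negative control mean."""
    return factor * mean(nc_absorbances)


def is_confirmed(absorbance, cutoff):
    """Assumed rule: a specimen at or above the cutoff is confirmed."""
    return absorbance >= cutoff


if __name__ == "__main__":
    # Hypothetical negative control replicates, for illustration only.
    negative_controls = [0.04, 0.06, 0.05]
    cutoff = peptide_cutoff(negative_controls)
    print(f"peptide assay cutoff: {cutoff:.3f}")
    print(f"A492 of 0.35 confirmed: {is_confirmed(0.35, cutoff)}")
```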

1. Panels Containing Specimens "At Risk" for HCV Infection

A group of 233 specimens representing 23 hemodialysis patients, all with
clinically diagnosed NANBH, were supplied by Gary Gitnick, M.D. at the University of
California, Los Angeles Center for the Health Sciences. These samples, which were
tested by the screening assay utilizing c100-3, were subsequently tested in the
screening assay which uses pHCV-31 and pHCV-34. A total of 7/23 patients
(30.44%) were reactive in the c100-3 screening assay, with a total of 36 repeat
reactive specimens. Ten of 23 patients (43.48%) were reactive by the screening
assay utilizing pHCV-31 and pHCV-34, with a total of 70 repeatable reactives
among the available specimens (Figure 16). Two specimens were unavailable for
testing. All of the 36 repeatedly reactive specimens detected in the c100-3
screening assay were confirmed by synthetic peptide confirmatory assays. A total
of 34 of these 36 were repeatedly reactive on HCV EIA utilizing pHCV-34 and
pHCV-31; two specimens were not available for testing. Of the 36 specimens
additionally detected by the screening assay utilizing pHCV-34 and pHCV-31, 9
were confirmed by the core peptide confirmatory assay (sp75) and 27 were
confirmed by the SOD-33c confirmatory assay.

In summary, these data indicate that detection of anti-HCV by the screening
assay utilizing pHCV-31 and pHCV-34 may occur at an equivalent bleed date or as
many as 9 months earlier, when compared to the c100-3 screening assay. Figure
17 depicts earlier detection by the screening assay utilizing pHCV-34 and pHCV-31
in a hemodialysis patient.

5. Acute/Chronic Non-A, Non-B Hepatitis

A population of specimens was identified from individuals diagnosed as
having acute or chronic NANBH. Specimens from individuals with acute cases of
NANBH were received from Gary Gitnick, M.D. at the University of California, Los
Angeles Center for Health Sciences. The diagnosis of acute hepatitis was based on the
presence of a cytolytic syndrome (ALT levels greater than 2X the upper normal
limit) on at least 2 serum samples for a duration of less than 6 months with or
without other biological abnormalities and clinical symptoms. All specimens were
also negative for IgM antibodies to Hepatitis A Virus (HAV) and were negative for
Hepatitis B surface Ag when tested with commercially available tests. Specimens
from cases of chronic NANBH were obtained from two clinical sites. Individuals
were diagnosed as having chronic NANBH based on the following criteria:
persistently elevated ALT levels, liver biopsy results, and/or the absence of
detectable HBsAg. Specimens with biopsy results were further categorized as either
chronic active NANBH, chronic persistent NANBH, or chronic NANBH with
cirrhosis.

These specimens were tested by both the c100-3 screening assay and the
screening assay utilizing pHCV-34 and pHCV-31. The latter testing was performed
in replicates of two by both the Quantum and PPC methods.

Community Acquired NANBH (Acute)
The c100-3 screening assay detected 2 of 10 specimens (20.00%) as
repeatedly reactive, both of which were confirmed. The screening assay utilizing
pHCV-34 and pHCV-31 detected both of these specimens plus an additional 2
specimens (Figure 18). These 2 specimens were confirmed by sp75 (see Figure
19).

Acute Post-Transfusion NANBH

The c100-3 assay detected 4 of 32 specimens (12.50%) as repeatedly
reactive, all of which were confirmed. The screening assay utilizing pHCV-34 and
pHCV-31 detected 3 out of these 4 specimens (75%) as reactive. The one sample
that was missed had an S/CO of 0.95 by the latter screening test. This sample was
confirmed by the sp67 peptide (Figure 18). In addition, the screening assay
utilizing pHCV-34 and pHCV-31 detected 11 specimens not reactive in the c100-3
screening assay. Of the 9 specimens available for confirmation, 8 were confirmed
by sp75 and 1 could not be confirmed but had an S/CO of 0.90 in the sp65
confirmatory test (see Figure 19).
Chronic NANBH

A summary of the results on these populations is shown in Figure 20.
Overall, 155 of 164 (94.5%) chronic NANBH samples were detected by the
screening test utilizing pHCV-31 and pHCV-34 using either Quantum or PPC. The
155 reactive samples were all confirmed in alternate assays using synthetic
peptides based on sequences from either the c100, 33c or core regions of the HCV
genome. In contrast, only 138 of 164 (84.1%) specimens were positive by the
c100-3 assay. All but one of the 138 c100-3 samples were detected as positive by
the screening assay utilizing pHCV-31 and pHCV-34. The one discordant specimen
was not confirmed by either synthetic peptide or neutralization assays. Conversely,
there were 17 confirmed specimens which were positive only by the screening assay
utilizing pHCV-34 and pHCV-31.

The results indicate that the screening assay utilizing pHCV-34 and pHCV-31
is more sensitive than the current test in detecting HCV positive individuals
within chronically infected NANBH populations.
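
The detection rates quoted for the chronic NANBH population follow directly from the specimen counts. The short Python sketch below reproduces them; the function name `detection_rate` is illustrative only and is not part of the assay described here.

```python
def detection_rate(detected: int, total: int) -> float:
    """Return the percentage of specimens detected, rounded to one decimal."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100.0 * detected / total, 1)


# Counts taken from the chronic NANBH results above. The total of 164
# specimens is the one consistent with the stated 94.5% and 84.1%.
chronic_nanbh = {
    "pHCV-34/pHCV-31 screening assay": (155, 164),
    "c100-3 screening assay": (138, 164),
}

for assay, (detected, total) in chronic_nanbh.items():
    rate = detection_rate(detected, total)
    print(f"{assay}: {detected}/{total} = {rate}%")
```

The same function applies to the acute populations (for example 2 of 10 and 4 of 32 for the c100-3 assay).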



EXAMPLE 5. COMPETITION ASSAY

The recombinant polypeptides containing antigenic HCV epitopes are useful
for competition assays. To perform a neutralization assay, a recombinant
polypeptide representing epitopes within the c100-3 region such as CKS-BCD
(pHCV-23) is solubilized and mixed with a sample diluent to a final concentration
of 0.5-50 ug/ml. Ten microliters of specimen or diluted specimen is added to a
reaction well followed by 400 ul of the sample diluent containing the recombinant
polypeptide and, if desired, the mixture may be preincubated for about fifteen
minutes to two hours. A bead coated with c100-3 antigen is then added to the
reaction well and incubated for one hour at 40°C. After washing, 200 ul of a
peroxidase labeled goat anti-human IgG in conjugate diluent is added and incubated
for one hour at 40°C. After washing, OPD substrate is added and incubated at room
temperature for thirty minutes. The reaction is terminated by the addition of 1 N
sulfuric acid and the absorbance read at 492 nm.

Samples containing antibodies to the c100-3 antigen generate a reduced
signal caused by the competitive binding of the peptides to these antibodies in
solution. The percentage of competitive binding may be calculated by comparing the
absorbance value of the sample in the presence of a recombinant polypeptide to the
absorbance value of the sample assayed in the absence of a recombinant polypeptide
at the same dilution.
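
The text does not give an explicit formula for the percentage of competitive binding. One common reading, assumed here, is the percent reduction in absorbance caused by the competing polypeptide. The function name and the absorbance readings in this Python sketch are hypothetical.

```python
def percent_competition(a_with: float, a_without: float) -> float:
    """Percent signal reduction caused by the competing polypeptide.

    a_with    -- absorbance of the sample assayed with the polypeptide
    a_without -- absorbance of the same sample, same dilution, without it
    Assumed formula: 100 * (1 - a_with / a_without).
    """
    if a_without <= 0:
        raise ValueError("uncompeted absorbance must be positive")
    return 100.0 * (1.0 - a_with / a_without)


# Hypothetical readings, for illustration only.
uncompeted = 1.00
competed = 0.25
print(f"{percent_competition(competed, uncompeted):.1f}% competition")
```

A sample lacking antibodies to the c100-3 antigen would show little or no reduction under this definition.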

EXAMPLE 6. IMMUNODOT ASSAY

The immunodot assay system uses a panel of purified recombinant
polypeptides placed in an array on a nitrocellulose solid support. The prepared
solid support is contacted with a sample and captures specific antibodies to HCV
antigens. The captured antibodies are detected by a conjugate-specific reaction.
Preferably, the conjugate-specific reaction is quantified using a reflectance optics
assembly within an instrument which has been described in U.S. Patent Application
Serial No. 07/227,408 filed August 2, 1988. The related U.S. Patent Applications
Serial Nos. 07/227,272, 07/227,586 and 07/227,590 further describe specific
methods and apparatus useful to perform an immunodot assay. The assay has also
been described in U.S. Application Serial No. 07/532,489 filed June 6, 1990.
Briefly, a nitrocellulose-based test cartridge is treated with multiple antigenic
polypeptides. Each polypeptide is contained within a specific reaction zone on the
test cartridge. After all the antigenic polypeptides have been placed on the
nitrocellulose, excess binding sites on the nitrocellulose are blocked. The test
cartridge is then contacted with a sample such that each antigenic polypeptide in
each reaction zone will react if the sample contains the appropriate antibody. After
reaction, the test cartridge is washed and any antigen-antibody reactions are
identified using suitable well known reagents.

As described in the patent applications listed above, the entire process is
amenable to automation. The specifications of these applications related to the
method and apparatus for performing an immunodot assay are incorporated by
reference herein.

In a preferred immunodot assay, the recombinant polypeptides pHCV-23,
pHCV-29, pHCV-34, and c100-3 were diluted in the preferred buffers, pH
conditions, and spotting concentrations as summarized in Figure 21 and applied to a
preassembled nitrocellulose test cartridge. After drying the cartridge overnight at
room temperature, the non-specific binding capacity of the nitrocellulose
phase was blocked. The blocking solution contained 1% porcine gelatin, 1% casein
enzymatic hydrolysate, 5% Tween-20, 0.1% sodium azide, 0.5 M sodium chloride
and 20 mM Tris, pH 7.5.

Forty normal donors were assayed by following the method described above.
The mean reflectance density value then was determined for each of the recombinant
proteins. A cutoff value was calculated as the negative mean plus six standard
deviations. Test cartridges were incubated with samples A00642 and 423 (see
Figure 22). Sample A00642 was from a convalescent non-A, non-B hepatitis
patient, diluted in negative human plasma from 1:100 to 1:12800. The other
sample, 423, was from a paid plasma donor who tested positive in an assay using
a recombinant c100-3 polypeptide, diluted in negative human plasma from 1:40 to
1:2560. After sample incubation, sequential incubations with a biotin-conjugated
goat anti-human immunoglobulin-specific antibody, an alkaline phosphatase-conjugated
rabbit anti-biotin specific antibody, and 5-bromo-4-chloro-3-indolyl
phosphate produced a colored product at the site of the reaction. Sample to cutoff
values (S/CO) were determined for all HCV recombinant proteins. Those S/CO
values greater than or equal to 1.0 were considered reactive. The limiting dilution
was defined as the lowest dilution at which the S/CO was greater than or equal to
1.0. As seen in Figure 22, each sample tested positive for all HCV recombinant
proteins. The data demonstrate that reactivity for sample A00642 was greatest
with pHCV-29, and decreased for the remaining antigens pHCV-23, c100-3, and
pHCV-34. Sample 423 most strongly reacted with the recombinant proteins
expressing pHCV-29 and pHCV-34, and to a lesser extent with pHCV-23 and c100-3.
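
The cutoff, S/CO and limiting-dilution calculations described for the immunodot assay can be sketched in Python as follows. Several points are assumptions rather than statements from the text: the standard deviation is taken as the sample standard deviation, "lowest dilution" is read as the most dilute specimen that is still reactive, and all reflectance values are hypothetical.

```python
from statistics import mean, stdev


def cutoff(negative_values, n_sd=6):
    """Cutoff = negative-population mean plus n_sd standard deviations."""
    return mean(negative_values) + n_sd * stdev(negative_values)


def s_co(sample_value, cutoff_value):
    """Sample-to-cutoff ratio; values >= 1.0 are scored reactive."""
    return sample_value / cutoff_value


def limiting_dilution(readings, cutoff_value):
    """Return the greatest dilution factor whose S/CO is still >= 1.0.

    readings maps a dilution factor (100 for 1:100) to a reflectance value.
    Returns None when no dilution is reactive.
    """
    reactive = [d for d, value in readings.items()
                if s_co(value, cutoff_value) >= 1.0]
    return max(reactive) if reactive else None


# Hypothetical reflectance density values, for illustration only.
negatives = [0.010, 0.012, 0.011, 0.009, 0.013]
co = cutoff(negatives)
series = {100: 0.90, 400: 0.35, 1600: 0.08, 6400: 0.015}
print(f"cutoff = {co:.4f}")
print(f"limiting dilution = 1:{limiting_dilution(series, co)}")
```

In practice one cutoff would be computed per recombinant protein, since the negative mean is determined separately for each antigen spot.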

EXAMPLE 7. HCV CKS-NS5 EXPRESSION VECTORS

A. Preparation of HCV CKS-NS5E

Eight individual oligonucleotides representing amino acids 1932-2191 of
the HCV genome were ligated together and cloned as a 793 base pair EcoRI-BamHI
fragment into the CKS fusion vector pJO200. The resulting plasmid, designated
pHCV-45 (SEQ.ID.NO. 8), expresses the HCV CKS-NS5E antigen under control of the
lac promoter. The HCV CKS-NS5E antigen consists of 239 amino acids of CKS, nine
amino acids contributed by linker DNA sequences, and 260 amino acids from the
HCV NS4/NS5 region (amino acids 1932-2191). Figure 23 presents a schematic
representation of the recombinant antigen expressed by pHCV-45. SEQ.ID.NO. 10
and 11 present the DNA and amino acid sequence of the HCV CKS-NS5E recombinant
antigen produced by pHCV-45. Figure 24 presents the expression of pHCV-45
proteins in E. coli. Lane 1 contained the E. coli lysate containing pHCV-45
expressing the HCV CKS-NS5E antigen (amino acids 1932-2191) prior to
induction and lanes 2 and 3 after 2 and 4 hours post induction, respectively. These
results show that the pHCV-45 fusion protein has an apparent mobility
corresponding to a molecular size of 55,000 daltons. This compares acceptably to
the predicted molecular mass of 57,597 daltons.

B. Preparation of HCV CKS-NS5F

Eleven individual oligonucleotides representing amino acids 2188-2481 of
the HCV genome were ligated together and cloned as an 895 base pair EcoRI-BamHI
fragment into the CKS fusion vector pJO200. The resulting plasmid, designated
pHCV-48, expresses the HCV CKS-NS5F antigen under control of the lac promoter.
The HCV CKS-NS5F antigen consists of 239 amino acids of CKS, eight amino acids
contributed by linker DNA sequences, and 294 amino acids from the HCV NS5 region
(amino acids 2188-2481). Figure 25 presents a schematic representation of the
recombinant antigen expressed by pHCV-48. SEQ.ID.NO. 12 and 13 present the
DNA and amino acid sequence of the HCV CKS-NS5F recombinant antigen produced by
pHCV-48. Figure 26 presents the expression of pHCV-48 proteins in E. coli. Lane
1 contained the E. coli lysate containing pHCV-48 expressing the HCV CKS-NS5F
antigen (amino acids 2188-2481) prior to induction and lanes 2 and 3 after 2 and
4 hours post induction, respectively. These results show that the pHCV-48 fusion
protein has an apparent mobility corresponding to a molecular size of 65,000
daltons. This compares acceptably to the predicted molecular mass of 58,985
daltons.

C. Preparation of HCV CKS-NS5G

Seven individual oligonucleotides representing amino acids 2480-2729 of
the HCV genome were ligated together and cloned as a 769 base pair EcoRI-BamHI
fragment into the CKS fusion vector pJO200. The resulting plasmid, designated
pHCV-51 (SEQ.ID.NO. 10), expresses the HCV CKS-NS5G antigen under control of
the lac promoter. The HCV CKS-NS5G antigen consists of 239 amino acids of CKS,
eight amino acids contributed by linker DNA sequences, and 250 amino acids from
the HCV NS5 region (amino acids 2480-2729). Figure 27 presents a schematic
representation of the recombinant antigen expressed by pHCV-51. SEQ.ID.NO. 14
and 15 present the DNA and amino acid sequence of the HCV CKS-NS5G recombinant
antigen produced by pHCV-51. Figure 28 presents the expression of pHCV-51
proteins in E. coli. Lane 1 contained the E. coli lysate containing pHCV-51
expressing the HCV CKS-NS5G antigen (amino acids 2480-2729) prior to
induction and lanes 2 and 3 after 2 and 4 hours post induction, respectively. These
results show that the pHCV-51 fusion protein has an apparent mobility
corresponding to a molecular size of 55,000 daltons. This compares acceptably to
the predicted molecular mass of 54,720 daltons.

D. Preparation of HCV CKS-NS5H

Six individual oligonucleotides representing amino acids 2728-2867 of the
HCV genome were ligated together and cloned as a 439 base pair EcoRI-BamHI
fragment into the CKS fusion vector pJO200. The resulting plasmid, designated
pHCV-50 (SEQ.ID.NO. 11), expresses the HCV CKS-NS5H antigen under control of
the lac promoter. The HCV CKS-NS5H antigen consists of 239 amino acids of CKS,
eight amino acids contributed by linker DNA sequences, and 140 amino acids from
the HCV NS5 region (amino acids 2728-2867). Figure 29 presents a schematic
representation of the recombinant antigen expressed by pHCV-50. SEQ.ID.NO. 16
and 17 present the DNA and amino acid sequence of the HCV CKS-NS5H recombinant
antigen produced by pHCV-50. Figure 30 presents the expression of pHCV-50
proteins in E. coli. Lane 1 contained the E. coli lysate containing pHCV-50
expressing the HCV CKS-NS5H antigen (amino acids 2728-2867) prior to
induction and lanes 2 and 3 after 2 and 4 hours post induction, respectively. These
results show that the pHCV-50 fusion protein has an apparent mobility
corresponding to a molecular size of 45,000 daltons. This compares acceptably to
the predicted molecular mass of 42,783 daltons.

E. Preparation of HCV CKS-NS5I

Six individual oligonucleotides representing amino acids 2866-3011 of the
HCV genome were ligated together and cloned as a 460 base pair EcoRI-BamHI
fragment into the CKS fusion vector pJO200. The resulting plasmid, designated
pHCV-49 (SEQ.ID.NO. 12), expresses the HCV CKS-NS5I antigen under control
of the lac promoter. The HCV CKS-NS5I antigen consists of 239 amino acids of CKS,
eight amino acids contributed by linker DNA sequences, and 146 amino acids from
the HCV NS5 region (amino acids 2866-3011). Figure 31 presents a schematic
representation of the recombinant antigen expressed by pHCV-49. SEQ.ID.NO. 18
and 19 present the DNA and amino acid sequence of the HCV CKS-NS5I recombinant
antigen produced by pHCV-49. Figure 32 presents the expression of pHCV-49
proteins in E. coli. Lane 1 contained the E. coli lysate containing pHCV-49
expressing the HCV CKS-NS5I antigen (amino acids 2866-3011) prior to induction
and lanes 2 and 3 after 2 and 4 hours post induction, respectively. These results
show that the pHCV-49 fusion protein has an apparent mobility corresponding to a
molecular size of 42,000 daltons. This compares acceptably to the predicted
molecular mass of 43,497 daltons.
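
The agreement between apparent and predicted sizes for the five CKS-NS5 fusion proteins can be expressed as a percent deviation. The Python sketch below uses only the figures reported in this example; the measure itself is an illustrative choice, not one defined in the text.

```python
# (apparent size from the gel, predicted mass), in daltons, as reported above.
CKS_NS5_ANTIGENS = {
    "pHCV-45 (CKS-NS5E)": (55_000, 57_597),
    "pHCV-48 (CKS-NS5F)": (65_000, 58_985),
    "pHCV-51 (CKS-NS5G)": (55_000, 54_720),
    "pHCV-50 (CKS-NS5H)": (45_000, 42_783),
    "pHCV-49 (CKS-NS5I)": (42_000, 43_497),
}


def percent_deviation(apparent: float, predicted: float) -> float:
    """Signed deviation of the apparent size from the predicted mass, in %."""
    return 100.0 * (apparent - predicted) / predicted


for name, (apparent, predicted) in CKS_NS5_ANTIGENS.items():
    deviation = percent_deviation(apparent, predicted)
    print(f"{name}: apparent {apparent}, predicted {predicted}, {deviation:+.1f}%")
```

By this measure the largest discrepancy is for pHCV-48, whose apparent size exceeds the predicted mass by roughly ten percent.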

F. Immunoblot of HCV CKS-NS5 Antigens

Induced E. coli lysates containing pHCV-23, pHCV-45, pHCV-48, pHCV-51,
pHCV-50, or pHCV-49 were individually run on preparative SDS/PAGE gels to
separate the various HCV CKS-NS5 or HCV CKS-BCD recombinant antigens
from the majority of other E. coli proteins. Gel slices containing the separated
individual HCV CKS-NS5 or HCV CKS-BCD recombinant antigens were then
electrophoretically transferred to nitrocellulose, and the nitrocellulose sheet cut
into strips. Figure 40 presents the results of a Western Blot analysis of various
serum or plasma samples using these nitrocellulose strips. The arrows on the right
indicate the position of each HCV CKS-BCD or HCV CKS-NS5 recombinant antigen,
from top to bottom pHCV-23 (HCV CKS-BCD), pHCV-45 (HCV CKS-NS5E),
pHCV-48 (HCV CKS-NS5F), pHCV-51 (HCV CKS-NS5G), pHCV-50 (HCV CKS-NS5H),
pHCV-49 (HCV CKS-NS5I), and pJO200 (CKS). Panel A contained five normal
human plasma samples, panel B contained five normal human sera, panel C contained
twenty human sera positive in the Abbott HCV EIA test, panel D contained two mouse
sera directed against CKS, and panel E contained two normal mouse sera. Both the HCV
CKS-NS5E antigen expressed by pHCV-45 and the HCV CKS-NS5F antigen
expressed by pHCV-48 were immunoreactive when screened with human serum
samples containing HCV antibodies.

EXAMPLE 8. HCV CKS-C100

A. Preparation of HCV CKS-C100 Vectors

Eighteen individual oligonucleotides representing amino acids 1569-1931
of the HCV genome were ligated together and cloned as four separate EcoRI-BamHI
subfragments into the CKS fusion vector pJO200. After subsequent DNA sequence
confirmation, the four subfragments were digested with the appropriate restriction
enzymes, gel purified, ligated together, and cloned as an 1102 base pair
EcoRI-BamHI fragment in the CKS fusion vector pJO200. The resulting plasmid,
designated pHCV-24, expresses the HCV CKS-c100 antigen under control of the lac
promoter. The HCV CKS-c100 antigen consists of 239 amino acids of CKS, eight
amino acids contributed by linker DNA sequences, 363 amino acids from the HCV
NS4 region (amino acids 1569-1931) and 10 additional amino acids contributed
by linker DNA sequences. The HCV CKS-c100 antigen was expressed at very low
levels by pHCV-24.

Poor expression levels of this HCV CKS-c100 recombinant antigen were
overcome by constructing two additional clones containing deletions in the extreme
amino terminal portion of the HCV c100 region. The first of these clones,
designated pHCV-57 (SEQ.ID.NO. 20 and 21), contains a 23 amino acid deletion
(HCV amino acids 1575-1597) and was constructed by deleting a 69 base pair DdeI
restriction fragment. The second of these clones, designated pHCV-58 (SEQ.ID.NO.
22 and 23), contains a 21 amino acid deletion (HCV amino acids 1600-1620) and
was constructed by deleting a 63 base pair NlaIV-HaeIII restriction fragment.
Figure 34 presents a schematic representation of the recombinant antigens
expressed by pHCV-24, pHCV-57, and pHCV-58. SEQ.ID.NO. 13 presents the DNA
and amino acid sequence of the HCV-C100D1 recombinant antigen produced by
pHCV-57. SEQ.ID.NO. 14 presents the DNA and amino acid sequence of the
HCV-C100D2 recombinant antigen produced by pHCV-58. Figure 35 presents the
expression of pHCV-24, pHCV-57, and pHCV-58 proteins in E. coli. Lane 1
contained the E. coli lysate containing pHCV-24 expressing the HCV CKS-c100
antigen (amino acids 1569-1931) prior to induction and lanes 2 and 3 after 2 and
4 hours post induction, respectively. Lane 4 contained the E. coli lysate containing
pHCV-57 expressing the HCV CKS-C100D1 antigen (amino acids 1569-1574 and
1598-1931) prior to induction and lanes 5 and 6 after 2 and 4 hours induction,
respectively. Lane 7 contained the E. coli lysate containing pHCV-58 expressing the
HCV CKS-C100D2 antigen (amino acids 1569-1599 and 1621-1931) prior to
induction, and lanes 8 and 9 after 2 and 4 hours induction, respectively. These
results show that both the pHCV-57 and pHCV-58 fusion proteins express at
significantly higher levels than the pHCV-24 fusion protein and that both the
pHCV-57 and pHCV-58 fusion proteins have an apparent mobility corresponding to
a molecular size of 65,000 daltons. This compares acceptably to the predicted
molecular mass of 64,450 daltons for pHCV-57 and 64,458 daltons for pHCV-58.
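
The residue ranges retained in the two deletion clones follow from removing the deleted residues from the c100 range 1569-1931. The Python sketch below checks this bookkeeping, along with the stated sizes of the deleted restriction fragments (three base pairs per residue). The helper names are illustrative only.

```python
def retained_segments(full, deleted):
    """Return the inclusive residue ranges left after an internal deletion."""
    (f_start, f_end), (d_start, d_end) = full, deleted
    if not (f_start <= d_start <= d_end <= f_end):
        raise ValueError("deletion must lie within the full range")
    segments = []
    if d_start > f_start:
        segments.append((f_start, d_start - 1))
    if d_end < f_end:
        segments.append((d_end + 1, f_end))
    return segments


def span_length(residue_range):
    """Number of residues in an inclusive (first, last) range."""
    first, last = residue_range
    return last - first + 1


C100_RANGE = (1569, 1931)
DELETIONS = {"pHCV-57": (1575, 1597), "pHCV-58": (1600, 1620)}

for clone, deleted in DELETIONS.items():
    n_deleted = span_length(deleted)
    print(clone, "retains", retained_segments(C100_RANGE, deleted),
          "deleted residues:", n_deleted, "deleted bp:", 3 * n_deleted)
```

The retained ranges computed this way match those given for lanes 4 and 7 of Figure 35.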

EXAMPLE 9. HCV PCR DERIVED EXPRESSION VECTORS

A. Preparation of HCV DNA Fragments

RNA was extracted from the serum of various chimpanzees or humans
infected with HCV by first subjecting the samples to digestion with Proteinase K and
SDS for 1 hour at 37°C followed by numerous phenol:chloroform
extractions. The RNA was then concentrated by several ethanol precipitations and
resuspended in water. RNA samples were then reverse transcribed according to
supplier's instructions using a specific primer. A second primer was then added
and PCR amplification was performed according to supplier's instructions. An
aliquot of this PCR reaction was then subjected to an additional round of PCR using
nested primers located internal to the first set of primers. In general, these
primers also contained restriction endonuclease recognition sequences to be used for
subsequent cloning. An aliquot of this second round nested PCR reaction was then
subjected to agarose gel electrophoresis and Southern blot analysis to confirm the
specificity of the PCR reaction. The remainder of the PCR reaction was then
digested with the appropriate restriction enzymes, the HCV DNA fragment of
interest gel purified, and ligated to an appropriate cloning vector. This ligation was
then transformed into E. coli and single colonies were isolated and plasmid DNA
prepared for DNA sequence analysis. The DNA sequence was then evaluated to
confirm that the specific HCV coding region of interest was intact. HCV DNA
fragments obtained in this manner were then cloned into appropriate vectors for
expression analysis.

B. Preparation of HCV CKS-NS3

Using the methods detailed above, a 474 base pair DNA fragment from the
putative NS3 region of HCV was generated by PCR. This fragment represents HCV
amino acids 1473-1629 and was cloned into the CKS expression vector pJO201
by blunt-end ligation. The resulting clone, designated pHCV-105, expresses the
HCV CKS-NS3 antigen under control of the lac promoter. The HCV CKS-NS3 antigen
consists of 239 amino acids of CKS, 12 amino acids contributed by linker DNA
sequences, 157 amino acids from the HCV NS3 region (amino acids 1473-1629),
and 9 additional amino acids contributed by linker DNA sequences. Figure 36
presents a schematic representation of the pHCV-105 antigen. SEQ.ID.NO. 24 and
25 present the DNA and amino acid sequence of the HCV CKS-NS3 recombinant
antigen produced by pHCV-105. Figure 37 presents the expression of pHCV-105
proteins in E. coli. Lane 1 contained the E. coli lysate containing pHCV-105
expressing the HCV CKS-NS3 antigen (amino acids 1473-1629) prior to induction
and lanes 2 and 3 after 2 and 4 hours induction, respectively. These results show
that the pHCV-105 fusion protein has an apparent mobility corresponding to a
molecular mass of 43,000 daltons. This compares acceptably to the predicted
molecular mass of 46,454 daltons.

C. Preparation of HCV CKS-5'ENV

Using the methods detailed above, a 489 base pair DNA fragment from the
putative envelope region of HCV was generated by PCR. This fragment represents
the HCV amino acids 114-276 and was cloned into the CKS expression vector
pJO202 using EcoRI-BamHI restriction sites. The resulting clone, designated
pHCV-103 (SEQ.ID.NO. 26 and 27), expresses the HCV CKS-5'ENV antigen under
control of the lac promoter. The HCV CKS-5'ENV antigen consists of 239 amino
acids of CKS, 7 amino acids contributed by linker DNA sequences, 163 amino acids
from the HCV envelope region (amino acids 114-276), and 16 additional amino
acids contributed by linker DNA sequences. Figure 38 presents a schematic
representation of the pHCV-103 antigen. SEQ.ID.NO. 26 and 27 present the DNA
and amino acid sequence of the HCV CKS-5'ENV recombinant antigen produced by
pHCV-103. Figure 37 presents the expression of pHCV-103 proteins in E. coli.
Lane 4 contained the E. coli lysate containing pHCV-103 expressing the HCV
CKS-5'ENV antigen (amino acids 114-276) prior to induction and lanes 5 and 6 after 2
and 4 hours induction, respectively. These results show that the pHCV-103 fusion
protein has an apparent mobility corresponding to a molecular mass of 47,000
daltons. This compares acceptably to the predicted molecular mass of 46,091
daltons.

D. Preparation of HCV CKS-3'ENV

Using the methods detailed above, a 621 base pair DNA fragment from the
putative envelope region of HCV was generated by PCR. This fragment represents
HCV amino acids 263-469 and was cloned into the CKS expression vector pJO202
using EcoRI restriction sites. The resulting clone, designated pHCV-101
(SEQ.ID.NO. 17), expresses the HCV CKS-3'ENV antigen under control of the lac
promoter. The HCV CKS-3'ENV antigen consists of 239 amino acids of CKS, 7
amino acids contributed by linker DNA sequences, 207 amino acids from the HCV
envelope region (amino acids 263-469), and 15 additional amino acids contributed
by linker DNA sequences. Figure 39 presents a schematic representation of the
pHCV-101 antigen. SEQ.ID.NO. 28 and 29 present the DNA and amino acid sequence
of the HCV CKS-3'ENV recombinant antigen produced by pHCV-101. Figure 37
presents the expression of pHCV-101 proteins in E. coli. Lane 7 contained the E. coli
lysate containing pHCV-101 expressing the HCV CKS-3'ENV antigen (amino acids
263-469) prior to induction and lanes 8 and 9 after 2 and 4 hours induction,
respectively. These results show that the pHCV-101 fusion protein has an
apparent mobility corresponding to a molecular mass of 47,000 daltons. This
compares acceptably to the predicted molecular mass of 51,181 daltons.

E. Preparation of HCV CKS-NS2

Using the methods detailed above, a 636 base pair DNA fragment from the
putative NS2 region of HCV was generated by PCR. This fragment represents the
HCV amino acids 994-1205 and was cloned into the CKS expression vector pJO201
using EcoRI restriction sites. The resulting clone, designated pHCV-102, expresses
the HCV CKS-NS2 antigen under control of the lac promoter. The HCV CKS-NS2
antigen consists of 239 amino acids of CKS, 7 amino acids contributed by linker
DNA sequences, 212 amino acids from the HCV NS2 region (amino acids
994-1205), and 16 additional amino acids contributed by linker DNA sequences. Figure
40 presents a schematic representation of the pHCV-102 antigen. SEQ.ID.NO. 30
and 31 present the DNA and amino acid sequence of the HCV CKS-NS2 recombinant
antigen produced by pHCV-102. Figure 41 presents the expression of pHCV-102
proteins in E. coli. Lane 1 contained the E. coli lysate containing pHCV-102
expressing the HCV CKS-NS2 antigen (amino acids 994-1205) prior to induction
and lanes 2 and 3 after 2 and 4 hours induction, respectively. These results show
that the pHCV-102 fusion protein has an apparent mobility corresponding to a
molecular mass of 53,000 daltons. This compares acceptably to the predicted
molecular mass of 51,213 daltons.

F. Preparation of HCV CKS-NS1

Using the methods detailed above, a 654 base pair DNA fragment from the
putative NS1 region of HCV was generated by PCR. This fragment represents HCV
amino acids 617-834 and was cloned into the CKS expression vector pJO200 using
EcoRI-BamHI restriction sites. The resulting clone, designated pHCV-107,
expresses the HCV CKS-NS1 antigen under control of the lac promoter. The HCV
CKS-NS1 antigen consists of 239 amino acids of CKS, 10 amino acids contributed by
linker DNA sequences, and 218 amino acids from the HCV NS1 region (amino acids
617-834). Figure 42 presents a schematic representation of the pHCV-107
antigen. SEQ.ID.NO. 32 and 33 present the DNA and amino acid sequence of the HCV
CKS-NS1 recombinant antigen produced by pHCV-107.

G. Preparation of HCV CKS-ENV

Using the methods detailed above, a 1068 base pair DNA fragment from the
putative envelope region of HCV was generated by PCR. This fragment represents
HCV amino acids 114-469 and was cloned into the CKS expression vector pJO202
using EcoRI restriction sites. The resulting clone, designated pHCV-104, expresses
the HCV CKS-ENV antigen under control of the lac promoter. The HCV CKS-ENV
antigen consists of 239 amino acids of CKS, 7 amino acids contributed by linker
DNA sequences, 356 amino acids from the HCV envelope region (amino acids
114-469), and 15 additional amino acids contributed by linker DNA sequences. Figure
43 presents a schematic representation of the pHCV-104 antigen. SEQ.ID.NO. 34
and 35 present the DNA and amino acid sequence of the HCV CKS-ENV recombinant
antigen produced by pHCV-104.
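
The fragment sizes given for the PCR-derived clones in this example can be cross-checked against the residue ranges, assuming three base pairs per encoded amino acid. The Python sketch below performs that check. The table layout and names are illustrative, and any base pairs beyond three per residue are simply reported, since the text does not say what they correspond to.

```python
# Clone: ((first residue, last residue), stated fragment length in base pairs)
PCR_FRAGMENTS = {
    "pHCV-105 (NS3)": ((1473, 1629), 474),
    "pHCV-103 (5'ENV)": ((114, 276), 489),
    "pHCV-101 (3'ENV)": ((263, 469), 621),
    "pHCV-102 (NS2)": ((994, 1205), 636),
    "pHCV-107 (NS1)": ((617, 834), 654),
    "pHCV-104 (ENV)": ((114, 469), 1068),  # 356 residues x 3 = 1068 bp
}


def residue_count(first: int, last: int) -> int:
    """Number of residues in an inclusive residue range."""
    return last - first + 1


def extra_base_pairs(residue_range, fragment_bp: int) -> int:
    """Base pairs in the fragment beyond three per encoded residue."""
    return fragment_bp - 3 * residue_count(*residue_range)


for clone, (residue_range, fragment_bp) in PCR_FRAGMENTS.items():
    n = residue_count(*residue_range)
    extra = extra_base_pairs(residue_range, fragment_bp)
    print(f"{clone}: {n} residues, {fragment_bp} bp, {extra} bp beyond coding")
```

All fragments except the NS3 fragment are exactly three base pairs per residue; the NS3 fragment is three base pairs longer than its 157 residues require.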

EXAMPLE 10. HCV CKS-NS1S1

A. Construction of the HCV CKS-NS1S1 Expression Vector

Eight individual oligonucleotides representing amino acids 365-579 of the
HCV genome were ligated together and cloned as a 645 base pair EcoRI/BamHI
fragment into the CKS fusion vector pJO200. The amino acid sequence of this
antigen is designated as pHCV-77 (SEQ. ID. NO. 1). The resultant fusion protein
HCV CKS-NS1S1 consists of 239 amino acids of CKS, seven amino acids contributed
by linker DNA sequences, and 215 amino acids from the NS1 region of the HCV
genome.

B. Production and Characterization of the Recombinant Antigen HCV-NS1S1

pHCV-77 was transformed into E. coli K-12 strain XL-1 (recA1, endA1,
gyrA96, thi-1, hsdR17, supE44, relA1, lac/F', proAB, lacIqZΔM15, Tn10)
cells. Expression analysis and characterization of the recombinant protein was done
using polyacrylamide gel electrophoresis as described in Example 1. The apparent
molecular weight of the pHCV-77 antigen was the same as the expected molecular
weight of 50,228 as visualized on a Coomassie stained gel. The immunoreactivity as
determined by Western blot analysis using human sera indicated that this
recombinant antigen was indeed immunoreactive. FIGURE 47A presents the
expression of pHCV-77 in E. coli. FIGURE 47B presents an immunoblot of the
pHCV-77 antigen expressed in E. coli. Lane 1 contained the E. coli lysate containing
pHCV-77 expressing the HCV CKS-NS1S1 antigen prior to induction and Lanes 2
and 3 are 2 and 4 hours post-induction, respectively.

EXAMPLE 11. HCV CKS-NS1S2

A. Construction of the HCV CKS-NS1S2 Expression Vector

Six individual oligonucleotides representing amino acids 565-731 of the
HCV genome were ligated together and cloned as a 501 base pair EcoRI/BamHI
fragment into the CKS fusion vector pJO200. The complete amino acid sequence of
this antigen is designated as pHCV-65 (SEQ. ID. NO. 2). The resultant fusion
protein HCV CKS-NS1S2 consists of 239 amino acids of CKS, eight amino acids
contributed by linker DNA sequences, and 167 amino acids from the NS1 region of
the HCV genome.

B. Production and Characterization of the Recombinant Antigen HCV-NS1S2

pHCV-65 was transformed into E. coli K-12 strain XL-1 (recA1, endA1,
gyrA96, thi-1, hsdR17, supE44, relA1, lac/F', proAB, lacIqZΔM15, Tn10) cells.
Expression analysis and characterization of the recombinant protein was done using
polyacrylamide gel electrophoresis as described in Example 1. The apparent
molecular weight of the pHCV-65 antigen was the same as the expected molecular
weight of 46,223 as visualized on a Coomassie stained gel. The immunoreactivity as
determined by Western blot analysis using human sera indicated that this
recombinant antigen was indeed immunoreactive. FIGURE 48A presents the
expression of pHCV-65 in E. coli. FIGURE 48B presents an immunoblot of the
pHCV-65 antigen expressed in E. coli. Lane 1 contained the E. coli lysate containing
pHCV-65 expressing the HCV CKS-NS1S2 antigen prior to induction and Lanes 2
and 3 are 2 and 4 hours post-induction, respectively.

EXAMPLE 12. CKS-NS1S3

A. Construction of the HCV CKS-NS1S3 Expression Vector

Six individual oligonucleotides representing amino acids 717-847 of the
HCV genome were ligated together and cloned as a 393 base pair EcoRI/BamHI
fragment into the CKS fusion vector pJO200. The complete amino acid sequence of
this antigen is designated as pHCV-78 (SEQ. ID. NO. 3). The resultant fusion
protein HCV CKS-NS1S3 consists of 239 amino acids of CKS, eight amino acids
contributed by linker DNA sequences, and 131 amino acids from the NS1 region of
the HCV genome.

B. Production and Characterization of the Recombinant Antigen HCV-NS1S3

pHCV-78 was transformed into E. coli K-12 strain XL-1 (recA1, endA1,
gyrA96, thi-1, hsdR17, supE44, relA1, lac/F', proAB, lacIqZΔM15, Tn10) cells.
Expression analysis and characterization of the recombinant protein was done using
polyacrylamide gel electrophoresis as described in Example 1. Analysis of the
Coomassie stained gel indicated very low levels of expression of the protein with an
expected molecular weight of 42,114. Western blot analysis also failed to show
any immunoreactivity, and we are continuing to identify human sera that are specific
to this region of NS1.

EXAMPLE 13. CKS-NS1S1-NS1S2

A. Construction of the HCV CKS-NS1S1-NS1S2 Expression Vector

The construction of pHCV-80 (NS1S1-NS1S2) involved using the
SacI/BamHI insert from pHCV-65 and ligating that into the SacI/BamHI vector
backbone of pHCV-77. The resultant HCV gene represents amino acids 365-731 of
the HCV genome. This resulted in a 1101 base pair EcoRI/BamHI fragment of HCV
cloned into the CKS fusion vector pJO200. The complete amino acid sequence of this
antigen is designated as pHCV-80 (SEQ. ID. NO. 4). The resultant fusion protein
HCV CKS-NS1S1-NS1S2 consists of 239 amino acids of CKS, seven amino acids
contributed by linker DNA sequences, and 367 amino acids from the NS1 region of
the HCV genome.

B. Production and Characterization of the Recombinant Antigen HCV-NS1S1-NS1S2 

pHCV-80 was transformed into E. coli K-12 strain XL-1 (recA1, endA1, 
gyrA96, thi-1, hsdR17, supE44, relA1, lac/F', proAB, lacIqZΔM15, Tn10) cells. 
Expression analysis and characterization of the recombinant protein were done using 
polyacrylamide gel electrophoresis as described in Example 1. The apparent 
molecular weight of the pHCV-80 antigen was the same as the expected molecular 
weight of 68,454 as visualized on a Coomassie stained gel. The immunoreactivity as 
determined by Western blot analysis using human sera indicated that this 
recombinant antigen was very immunoreactive. FIGURE 49A presents the 
expression of pHCV-80 in E. coli. FIGURE 49B presents an immunoblot of pHCV-80 
antigen expressed in E. coli. Lane 1 contained the E. coli lysate containing pHCV-80 
expressing the HCV CKS-NS1S1-NS1S2 antigen prior to induction and Lanes 2 
and 3 are 2 and 4 hours post-induction, respectively. 
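As a rough plausibility check on the expected molecular weight of 68,454 quoted above, the residue count of the fusion protein can be multiplied by an average residue mass. The sketch below is an illustration only, not the calculation used in the source. It assumes the common rule of thumb of about 110 daltons per amino acid residue, and the names in it are hypothetical.

```python
# Rule-of-thumb average mass of one amino acid residue (an assumption,
# not a value taken from the source document).
AVERAGE_RESIDUE_MASS_DA = 110.0


def estimate_molecular_weight(n_residues):
    """Approximate protein mass in daltons from its residue count."""
    return n_residues * AVERAGE_RESIDUE_MASS_DA


# Composition of HCV CKS-NS1S1-NS1S2 as stated in Example 13:
# 239 CKS residues + 7 linker residues + 367 HCV residues.
n_residues = 239 + 7 + 367
estimate = estimate_molecular_weight(n_residues)

stated = 68454  # expected molecular weight quoted in the text
relative_difference = abs(estimate - stated) / stated

print(n_residues, estimate, round(relative_difference, 3))
```

An estimate of this kind ignores the actual amino acid composition, so it is only expected to land within a few percent of a sequence-based value.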

EXAMPLE 14. HCV CKS-FULL LENGTH NS1 

A. Construction of the HCV CKS-full length NS1 Expression Vector 

The construction of pHCV-92 (SEQ. ID. NO. 5; full length NS1) involved 
using the XhoI/BamHI insert from pHCV-78 (SEQ. ID. NO. 3) and ligating that into 
the XhoI/BamHI vector backbone of pHCV-80 (SEQ. ID. NO. 4). The resultant HCV 
gene represents amino acids 365-847 of the HCV genome. This resulted in a 1449 
base pair EcoRI/BamHI fragment of HCV cloned into the CKS fusion vector pJO200. The 
complete amino acid sequence of this antigen is designated as pHCV-92 (SEQ. ID. NO. 
5). The resultant fusion protein HCV CKS-full length NS1 consists of 239 amino 
acids of CKS, seven amino acids contributed by linker DNA sequences, and 483 
amino acids from the NS1 region of the HCV genome. 
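The residue ranges involved in this construction can be checked against each other. The following Python sketch is a hypothetical illustration that treats the pHCV-80 and pHCV-78 inserts purely as inclusive residue ranges. It makes no claim about where the XhoI site lies within the shared region, and the function names are not from the source.

```python
def span_length(span):
    """Number of residues in an inclusive (first, last) range."""
    first, last = span
    return last - first + 1


def overlap_length(a, b):
    """Number of residues shared by two inclusive ranges (0 if none)."""
    lo = max(a[0], b[0])
    hi = min(a[1], b[1])
    return max(0, hi - lo + 1)


def merged_span(a, b):
    """Smallest range covering two overlapping inclusive ranges."""
    if overlap_length(a, b) == 0:
        raise ValueError("spans do not overlap")
    return (min(a[0], b[0]), max(a[1], b[1]))


NS1S1_NS1S2 = (365, 731)   # HCV residues in pHCV-80
NS1S3 = (717, 847)         # HCV residues in pHCV-78

full_length = merged_span(NS1S1_NS1S2, NS1S3)
print(full_length, span_length(full_length), overlap_length(NS1S1_NS1S2, NS1S3))
print(3 * span_length(full_length), 239 + 7 + span_length(full_length))
```

With the ranges given in Examples 12 and 13, the merged range is 365-847, which is 483 residues and 1449 base pairs at three bases per codon, in agreement with the figures stated above.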

B. Production and Characterization of the Recombinant Antigen pHCV-92 

pHCV-92 was transformed into E. coli K-12 strain XL-1 (recA1, endA1, 
gyrA96, thi-1, hsdR17, supE44, relA1, lac/F', proAB, lacIqZΔM15, Tn10) cells. 
Expression analysis and characterization of the recombinant protein were done using 
polyacrylamide gel electrophoresis as described in Example 1. The expression 
levels as seen by Coomassie stained gel were virtually undetectable and the Western 
blot indicated no immunoreactivity. We are still in the process of identifying sera 
that will recognize this region of HCV NS1. 

The present invention thus provides unique recombinant antigens 
representing distinct antigenic regions of the HCV genome which can be used as 
reagents for the detection and/or confirmation of antibodies and antigens in test 
samples from individuals exposed to HCV. The NS1 protein is considered to be a 
non-structural membrane glycoprotein and to be able to elicit a protective immune 
response of the host against lethal viral infection. 

The recombinant antigens, either alone or in combination, can be used in the 
assay formats provided herein and exemplified in the Examples. It also is 
contemplated that these recombinant antigens can be used to develop specific 
inhibitors of viral replication and used for therapeutic purposes, such as for 
vaccines. Other applications and modifications of the use of these antigens and the 
specific embodiments of this invention as set forth herein will be apparent to 
those skilled in the art. Accordingly, the invention is intended to be limited only in 
accordance with the appended claims. 

SEQUENCE LISTING 



(1) GENERAL INFORMATION: 

(i) APPLICANT: DEVARE, S. 
DESAI, S. 
DAILEY, S. 

(ii) TITLE OF INVENTION: HCV SYNTHETIC PEPTIDE FROM NS1 REGION 
(iii) NUMBER OF SEQUENCES: 35 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: ABBOTT LABORATORIES 

(B) STREET: ONE ABBOTT PARK ROAD 

(C) CITY: ABBOTT PARK 

(D) STATE: ILLINOIS 

(E) COUNTRY: U.S. 

(F) ZIP: 60065-3500 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25 

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 

(C) CLASSIFICATION: 

(viii) ATTORNEY/AGENT INFORMATION: 
(A) NAME: POREMBSKI, PRISCILLA E. 
(B) REGISTRATION NUMBER: 33,207 
(C) REFERENCE/DOCKET NUMBER: 4834PG.02 

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: 708-937-6365 

(B) TELEFAX: 708-937-9556 



(2) INFORMATION FOR SEQ ID NO:1 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 463 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1: 

Met Ser Phe Val Val Ile Ile Pro Ala Arg Tyr Ala Ser Thr Arg Leu 
1 5 10 15 

Pro Gly Lys Pro Leu Val Asp Ile Asn Gly Lys Pro Met Ile Val His 
20 25 30 

Val Leu Glu Arg Ala Arg Glu Ser Gly Ala Glu Arg Ile Ile Val Ala 
35 40 45 

Thr Asp His Glu Asp Val Ala Arg Ala Val Glu Ala Ala Gly Gly Glu 
50 55 60 

Val Cys Met Thr Arg Ala Asp His Gln Ser Gly Thr Glu Arg Leu Ala 
65 70 75 80 

Glu Val Val Glu Lys Cys Ala Phe Ser Asp Asp Thr Val Ile Val Asn 
85 90 95 

Val Gln Gly Asp Glu Pro Met Ile Pro Ala Thr Ile Ile Arg Gln Val 
100 105 110 

Ala Asp Asn Leu Ala Gln Arg Gln Val Gly Met Thr Thr Leu Ala Val 
115 120 125 

Pro Ile His Asn Ala Glu Glu Ala Phe Asn Pro Asn Ala Val Lys Val 
130 135 140 

Val Leu Asp Ala Glu Gly Tyr Ala Leu Tyr Phe Ser Arg Ala Thr Ile 
145 150 155 160 

Pro Trp Asp Arg Asp Arg Phe Ala Glu Gly Leu Glu Thr Val Gly Asp 
165 170 175 

Asn Phe Leu Arg His Leu Gly Ile Tyr Gly Tyr Arg Ala Gly Phe Ile 
180 185 190 

Arg Arg Tyr Val Asn Trp Gln Pro Ser Pro Leu Glu His Ile Glu Met 
195 200 205 

Leu Glu Gln Leu Arg Val Leu Trp Tyr Gly Glu Lys Ile His Val Ala 
210 215 220 

Val Ala Gln Glu Val Pro Gly Thr Gly Val Asp Thr Pro Glu Asp Leu 
225 230 235 240 

Asp Pro Ser Thr Asn Ser Thr Met Val Gly Asn Trp Ala Lys Val Leu 
245 250 255 

Val Val Leu Leu Leu Phe Ala Gly Val Asp Ala Glu Thr His Val Thr 
260 265 270 

Gly Gly Ser Ala Gly His Thr Val Ser Gly Phe Val Ser Leu Leu Ala 
275 280 285 

Pro Gly Ala Lys Gln Asn Val Gln Leu Ile Asn Thr Asn Gly Ser Trp 
290 295 300 

His Leu Asn Ser Thr Ala Leu Asn Cys Asn Asp Ser Leu Asn Thr Gly 
305 310 315 320 

Trp Leu Ala Gly Leu Phe Tyr His His Lys Phe Asn Ser Ser Gly Cys 
325 330 335 

Pro Glu Arg Leu Ala Ser Cys Arg Pro Leu Thr Asp Phe Asp Gln Gly 
340 345 350 

Trp Gly Gln Ile Ser Tyr Ala Asn Gly Ser Gly Pro Asp Gln Arg Pro 
355 360 365 

Tyr Cys Trp His Tyr Pro Pro Lys Pro Cys Gly Ile Val Pro Ala Lys 
370 375 380 

Ser Val Cys Gly Pro Val Tyr Cys Phe Thr Pro Ser Pro Val Val Val 
385 390 395 400 

Gly Thr Thr Asp Arg Ser Gly Ala Pro Thr Tyr Ser Trp Gly Glu Asn 
405 410 415 

Asp Thr Asp Val Phe Val Leu Asn Asn Thr Arg Pro Pro Leu Gly Asn 
420 425 430 

Trp Phe Gly Cys Thr Trp Met Asn Ser Thr Gly Phe Thr Lys Val Cys 
435 440 445 

Gly Ala Pro Pro Cys Val Ile Gly Gly Ala Gly Asn Asn Thr Leu 
450 455 460 



(2) INFORMATION FOR SEQ ID NO:2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 414 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2: 

Met Ser Phe Val Val Ile Ile Pro Ala Arg Tyr Ala Ser Thr Arg Leu 
1 5 10 15 

Pro Gly Lys Pro Leu Val Asp Ile Asn Gly Lys Pro Met Ile Val His 
20 25 30 

Val Leu Glu Arg Ala Arg Glu Ser Gly Ala Glu Arg Ile Ile Val Ala 
35 40 45 

Thr Asp His Glu Asp Val Ala Arg Ala Val Glu Ala Ala Gly Gly Glu 
50 55 60 

Val Cys Met Thr Arg Ala Asp His Gln Ser Gly Thr Glu Arg Leu Ala 
65 70 75 80 

Glu Val Val Glu Lys Cys Ala Phe Ser Asp Asp Thr Val Ile Val Asn 
85 90 95 

Val Gln Gly Asp Glu Pro Met Ile Pro Ala Thr Ile Ile Arg Gln Val 
100 105 110 

Ala Asp Asn Leu Ala Gln Arg Gln Val Gly Met Thr Thr Leu Ala Val 
115 120 125 

Pro Ile His Asn Ala Glu Glu Ala Phe Asn Pro Asn Ala Val Lys Val 
130 135 140 

Val Leu Asp Ala Glu Gly Tyr Ala Leu Tyr Phe Ser Arg Ala Thr Ile 
145 150 155 160 

Pro Trp Asp Arg Asp Arg Phe Ala Glu Gly Leu Glu Thr Val Gly Asp 
165 170 175 

Asn Phe Leu Arg His Leu Gly Ile Tyr Gly Tyr Arg Ala Gly Phe Ile 
180 185 190 

Arg Arg Tyr Val Asn Trp Gln Pro Ser Pro Leu Glu His Ile Glu Met 
195 200 205 

Leu Glu Gln Leu Arg Val Leu Trp Tyr Gly Glu Lys Ile His Val Ala 
210 215 220 

Val Ala Gln Glu Val Pro Gly Thr Gly Val Asp Thr Pro Glu Asp Leu 
225 230 235 240 

Asp Pro Ser Thr Asn Ser Met Gly Ala Pro Pro Cys Val Ile Gly Gly 
245 250 255 

Ala Gly Asn Asn Thr Leu His Cys Pro Thr Asp Cys Phe Arg Lys His 
260 265 270 

Pro Asp Ala Thr Tyr Ser Arg Cys Gly Ser Gly Pro Trp Ile Thr Pro 
275 280 285 

Arg Cys Leu Val Asp Tyr Pro Tyr Arg Leu Trp His Thr Pro Cys Thr 
290 295 300 

Ile Asn Thr Thr Ile Phe Lys Ile Arg Met Tyr Val Gly Gly Val Glu 
305 310 315 320 

His Arg Leu Glu Ala Ala Cys Asn Trp Thr Arg Gly Glu Arg Cys Asp 
325 330 335 

Leu Glu Asp Arg Asp Arg Ser Glu Leu Ser Pro Leu Leu Leu Thr Thr 
340 345 350 

Thr Gln Trp Gln Val Leu Pro Cys Ser Phe Thr Thr Leu Pro Ala Leu 
355 360 365 

Ser Thr Gly Leu Ile His Leu Gly Gln Asn Ile Val Asp Val Gln Tyr 
370 375 380 

Leu Tyr Gly Val Gly Ser Ser Ile Ala Ser Trp Ala Ile Lys Trp Glu 
385 390 395 400 

Tyr Val Val Leu Leu Phe Leu Leu Leu Ala Asp Ala Arg Val 
405 410 



(2) INFORMATION FOR SEQ ID NO:3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 378 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3: 

Met Ser Phe Val Ile Ile Pro Ala Arg Tyr Ala Ser Thr Arg Leu Pro 
1 5 10 15 

Gly Lys Pro Leu Val Asp Ile Asn Gly Lys Pro Met Ile Val His Val 
20 25 30 

Leu Glu Arg Ala Arg Glu Ser Gly Ala Glu Arg Ile Ile Val Ala Thr 
35 40 45 

Asp His Glu Asp Val Ala Arg Ala Val Glu Ala Ala Gly Gly Glu Val 
50 55 60 

Cys Met Thr Arg Ala Asp His Gln Ser Gly Thr Glu Arg Leu Ala Glu 
65 70 75 80 

Val Val Glu Lys Cys Ala Phe Ser Asp Asp Thr Val Ile Val Asn Val 
85 90 95 

Gln Gly Asp Glu Pro Met Ile Pro Ala Thr Ile Ile Arg Gln Val Ala 
100 105 110 

Asp Asn Leu Ala Gln Arg Gln Val Gly Met Thr Thr Leu Ala Val Pro 
115 120 125 

Ile His Asn Ala Glu Glu Ala Phe Asn Pro Asn Ala Val Lys Val Val 
130 135 140 

Leu Asp Ala Glu Gly Tyr Ala Leu Tyr Phe Ser Arg Ala Thr Ile Pro 
145 150 155 160 

Trp Asp Arg Asp Arg Phe Ala Glu Gly Leu Glu Thr Val Gly Asp Asn 
165 170 175 

Phe Leu Arg His Leu Gly Ile Tyr Gly Tyr Arg Ala Gly Phe Ile Arg 
180 185 190 

Arg Tyr Val Asn Trp Gln Pro Ser Pro Leu Glu His Ile Glu Met Leu 
195 200 205 

Glu Gln Leu Arg Val Leu Trp Tyr Gly Glu Lys Ile His Val Ala Val 
210 215 220 

Ala Gln Glu Val Pro Gly Thr Gly Val Asp Thr Pro Glu Asp Leu Asp 
225 230 235 240 

Pro Ser Thr Asn Ser Thr Met Glu Tyr Val Val Leu Leu Phe Leu Leu 
245 250 255 

Leu Ala Asp Ala Arg Val Cys Ser Cys Leu Trp Met Met Leu Leu Ile 
260 265 270 

Ser Gln Ala Glu Ala Ala Leu Glu Asn Leu Val Ile Leu Asn Ala Ala 
275 280 285 

Ser Leu Ala Gly Thr His Gly Leu Val Ser Phe Leu Val Phe Phe Cys 
290 295 300 

Phe Ala Trp Tyr Leu Lys Gly Lys Trp Val Pro Gly Ala Val Tyr Thr 
305 310 315 320 

Phe Tyr Gly Met Trp Pro Leu Leu Leu Leu Leu Leu Ala Leu Pro Gln 
325 330 335 

Arg Ala Tyr Ala Leu Asp Thr Glu Val Ala Ala Ser Cys Gly Gly Val 
340 345 350 

Val Leu Val Gly Leu Met Ala Leu Thr Leu Ser Pro Tyr Tyr Lys Arg 
355 360 365 

Tyr Ile Ser Trp Cys Leu Trp Trp Leu Gln 
370 375 



(2) INFORMATION FOR SEQ ID NO:4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 622 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4: 

Met Ser Phe Val Val Ile Ile Pro Ala Arg Tyr Ala Ser Thr Arg Leu 
1 5 10 15 

Pro Gly Lys Pro Leu Val Asp Ile Asn Gly Lys Pro Met Ile Val His 
20 25 30 

Val Leu Glu Arg Ala Arg Glu Ser Gly Ala Glu Arg Ile Ile Val Ala 
35 40 45 

Thr Asp His Glu Asp Val Ala Arg Ala Val Glu Ala Ala Gly Gly Glu 
50 55 60 

Val Cys Met Thr Arg Ala Asp His Gln Ser Gly Thr Glu Arg Leu Ala 
65 70 75 80 

Glu Val Val Glu Lys Cys Ala Phe Ser Asp Asp Thr Val Ile Val Asn 
85 90 95 

Val Gln Gly Asp Glu Pro Met Ile Pro Ala Thr Ile Ile Arg Gln Val 
100 105 110 

Ala Asp Asn Leu Ala Gln Arg Gln Val Gly Met Thr Thr Leu Ala Val 
115 120 125 

Pro Ile His Asn Ala Glu Glu Ala Phe Asn Pro Asn Ala Val Lys Val 
130 135 140 

Val Leu Asp Ala Glu Gly Tyr Ala Leu Tyr Phe Ser Arg Ala Thr Ile 
145 150 155 160 

Pro Trp Asp Arg Asp Arg Phe Ala Glu Gly Leu Glu Thr Val Gly Asp 
165 170 175 

Asn Phe Leu Arg His Leu Gly Ile Tyr Gly Tyr Arg Ala Gly Phe Ile 
180 185 190 

Arg Arg Tyr Val Asn Trp Gln Pro Ser Pro Leu Glu His Ile Glu Met 
195 200 205 

Leu Glu Gln Leu Arg Val Leu Trp Tyr Gly Glu Lys Ile His Val Ala 
210 215 220 

Val Ala Gln Glu Val Pro Gly Thr Gly Val Asp Thr Pro Glu Asp Leu 
225 230 235 240 

Asp Pro Ser Thr Asn Ser Thr Met Val Gly Asn Trp Ala Lys Val Leu 
245 250 255 

Val Val Leu Leu Leu Phe Ala Gly Val Asp Ala Glu Thr His Val Thr 
260 265 270 

Gly Gly Ser Ala Gly His Thr Val Ser Gly Phe Val Ser Leu Leu Ala 
275 280 285 

Pro Gly Ala Lys Gln Asn Val Gln Leu Ile Asn Thr Asn Gly Ser Trp 
290 295 300 

His Leu Asn Ser Thr Ala Leu Asn Cys Asn Asp Ser Leu Asn Thr Gly 
305 310 315 320 

Trp Leu Ala Gly Leu Phe Tyr His His Lys Phe Asn Ser Ser Gly Cys 
325 330 335 

Pro Glu Arg Leu Ala Ser Cys Arg Pro Leu Thr Asp Phe Asp Gln Gly 
340 345 350 

Trp Gly Gln Ile Ser Tyr Ala Asn Gly Ser Gly Pro Asp Gln Arg Pro 
355 360 365 

Tyr Cys Trp His Tyr Pro Pro Lys Pro Cys Gly Ile Val Pro Ala Lys 
370 375 380 

Ser Val Cys Gly Pro Val Tyr Cys Phe Thr Pro Ser Pro Val Val Val 
385 390 395 400 

Gly Thr Thr Asp Arg Ser Gly Ala Pro Thr Tyr Ser Trp Gly Glu Asn 
405 410 415 

Asp Thr Asp Val Phe Val Leu Asn Asn Thr Arg Pro Pro Leu Gly Asn 
420 425 430 

Trp Phe Gly Cys Thr Trp Met Asn Ser Thr Gly Phe Thr Lys Val Cys 
435 440 445 

Gly Ala Pro Pro Cys Val Ile Gly Pro Pro Cys Val Ile Gly Gly Ala 
450 455 460 

Gly Asn Asn Thr Leu His Cys Pro Thr Asp Cys Phe Arg Lys His Pro 
465 470 475 480 

Asp Ala Thr Tyr Ser Arg Cys Gly Ser Gly Pro Trp Ile Thr Pro Arg 
485 490 495 

Cys Leu Val Asp Tyr Pro Tyr Arg Leu Trp His Tyr Pro Cys Thr Ile 
500 505 510 

Asn Tyr Thr Ile Phe Lys Ile Arg Met Tyr Val Gly Gly Val Glu His 
515 520 525 

Arg Leu Glu Ala Ala Cys Asn Trp Thr Arg Gly Glu Arg Cys Asp Leu 
530 535 540 

Glu Asp Arg Asp Arg Ser Glu Leu Ser Pro Leu Leu Leu Thr Thr Thr 
545 550 555 560 

Gln Trp Gln Val Leu Pro Cys Ser Phe Thr Thr Leu Pro Ala Leu Ser 
565 570 575 

Thr Gly Leu Ile His Leu His Gln Asn Ile Val Asp Val Gln Tyr Leu 
580 585 590 

Tyr Gly Val Gly Ser Ser Ile Ala Ser Trp Ala Ile Lys Trp Glu Tyr 
595 600 605 

Val Val Leu Leu Phe Leu Leu Leu Ala Asp Ala Arg Val Xaa 
610 615 620 



(2) INFORMATION FOR SEQ ID NO:5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 738 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5: 

Met Ser Phe Val Val Ile Ile Pro Ala Arg Tyr Ala Ser Thr Arg Leu 
1 5 10 15 

Pro Gly Lys Pro Leu Val Asp Ile Asn Gly Lys Pro Met Ile Val His 
20 25 30 

Val Leu Glu Arg Ala Arg Glu Ser Gly Ala Glu Arg Ile Ile Val Ala 
35 40 45 

Thr Asp His Glu Asp Val Ala Arg Ala Val Glu Ala Ala Gly Gly Glu 
50 55 60 

Val Cys Met Thr Arg Ala Asp His Gln Ser Gly Thr Glu Arg Leu Ala 
65 70 75 80 

Glu Val Val Glu Lys Cys Ala Phe Ser Asp Asp Thr Val Ile Val Asn 
85 90 95 

Val Gln Gly Asp Glu Pro Met Ile Pro Ala Thr Ile Ile Arg Gln Val 
100 105 110 

Ala Asp Asn Leu Ala Gln Arg Gln Val Gly Met Thr Thr Leu Ala Val 
115 120 125 

Pro Ile His Asn Ala Glu Glu Ala Phe Asn Pro Asn Ala Val Lys Val 
130 135 140 

Val Leu Asp Ala Glu Gly Tyr Ala Leu Tyr Phe Ser Arg Ala Thr Ile 
145 150 155 160 

Pro Trp Asp Arg Asp Arg Phe Ala Glu Gly Leu Glu Thr Val Gly Asp 
165 170 175 

Asn Phe Leu Arg His Leu Gly Ile Tyr Gly Tyr Arg Ala Gly Phe Ile 
180 185 190 

Arg Arg Tyr Val Asn Trp Gln Pro Ser Pro Leu Glu His Ile Glu Met 
195 200 205 

Leu Glu Gln Leu Arg Val Leu Trp Tyr Gly Glu Lys Ile His Val Ala 
210 215 220 

Val Ala Gln Glu Val Pro Gly Thr Gly Val Asp Thr Pro Glu Asp Leu 
225 230 235 240 

Asp Pro Ser Thr Asn Ser Thr Met Val Gly Asn Trp Ala Lys Val Leu 
245 250 255 

Val Val Leu Leu Leu Phe Ala Gly Val Asp Ala Glu Thr His Val Thr 
260 265 270 

Gly Gly Ser Ala Gly His Thr Val Ser Gly Phe Val Ser Leu Leu Ala 
275 280 285 

Pro Gly Ala Lys Gln Asn Val Gln Leu Ile Asn Thr Asn Gly Ser Trp 
290 295 300 

His Leu Asn Ser Thr Ala Leu Asn Cys Asn Asp Ser Leu Asn Thr Gly 
305 310 315 320 

Trp Leu Ala Gly Leu Phe Tyr His His Lys Phe Asn Ser Ser Gly Cys 
325 330 335 

Pro Glu Arg Leu Ala Ser Cys Arg Pro Leu Thr Asp Phe Asp Gln Gly 
340 345 350 

Trp Gly Gln Ile Ser Tyr Ala Asn Gly Ser Gly Pro Asp Gln Arg Pro 
355 360 365 

Tyr Cys Trp His Tyr Pro Pro Lys Pro Cys Gly Ile Val Pro Ala Lys 
370 375 380 

Ser Val Cys Gly Pro Val Tyr Cys Phe Thr Pro Ser Pro Val Val Val 
385 390 395 400 

Gly Thr Thr Asp Arg Ser Gly Ala Pro Thr Tyr Ser Trp Gly Glu Asn 
405 410 415 

Asp Thr Asp Val Phe Val Leu Asn Asn Thr Arg Pro Pro Leu Gly Asn 
420 425 430 

Trp Phe Gly Cys Thr Trp Met Asn Ser Thr Gly Phe Thr Lys Val Cys 
435 440 445 

Gly Ala Pro Pro Cys Val Ile Gly Pro Pro Cys Val Ile Gly Gly Ala 
450 455 460 

Gly Asn Asn Thr Leu His Cys Pro Thr Asp Cys Phe Arg Lys His Pro 
465 470 475 480 

Asp Ala Thr Tyr Ser Arg Cys Gly Ser Gly Pro Trp Ile Thr Pro Arg 
485 490 495 

Cys Leu Val Asp Tyr Pro Tyr Arg Leu Trp His Tyr Pro Cys Thr Ile 
500 505 510 

Asn Tyr Thr Ile Phe Lys Ile Arg Met Tyr Val Gly Gly Val Glu His 
515 520 525 

Arg Leu Glu Ala Ala Cys Asn Trp Thr Arg Gly Glu Arg Cys Asp Leu 
530 535 540 

Glu Asp Arg Asp Arg Ser Glu Leu Ser Pro Leu Leu Leu Thr Thr Thr 
545 550 555 560 

Gln Trp Gln Val Leu Pro Cys Ser Phe Thr Thr Leu Pro Ala Leu Ser 
565 570 575 

Thr Gly Leu Ile His Leu His Gln Asn Ile Val Asp Val Gln Tyr Leu 
580 585 590 

Tyr Gly Val Gly Ser Ser Ile Ala Ser Trp Ala Ile Lys Trp Glu Tyr 
595 600 605 

Val Val Leu Leu Phe Leu Leu Leu Ala Asp Ala Arg Val Cys Ser Cys 
610 615 620 

Leu Trp Met Met Leu Leu Ile Ser Gln Ala Glu Ala Ala Leu Glu Asn 
625 630 635 640 

Leu Val Ile Leu Asn Ala Ala Ser Leu Ala Gly Thr His Gly Leu Val 
645 650 655 

Ser Phe Leu Val Phe Phe Cys Phe Ala Trp Tyr Leu Lys Gly Lys Trp 
660 665 670 

Val Pro Gly Ala Val Tyr Thr Phe Tyr Gly Met Trp Pro Leu Leu Leu 
675 680 685 

Leu Leu Leu Ala Leu Pro Gln Arg Ala Tyr Ala Leu Asp Thr Glu Val 
690 695 700 

Ala Ala Ser Cys Gly Gly Val Val Leu Val Gly Leu Met Ala Leu Thr 
705 710 715 720 

Leu Ser Pro Tyr Tyr Lys Arg Tyr Ile Ser Trp Cys Leu Trp Trp Leu 
725 730 735 

Gln Xaa 

(2) INFORMATION FOR SEQ ID NO:6: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 4481 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: circular 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 

(A) NAME/KEY: CDS 
(B) LOCATION: 130..1317 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6: 

GAaTTAATTCCCATTAATGTGAGTTAGCTCACTCATrAGGCACODC^^ 60 

ATGTTCCGGCTCGTATTITGTGTGGWTGTGAGCGGATAACAATTGQGCATCCAGTA^ 120 

GAGGTrTAAATGAGrTTrrGTGGTCATTATTCCCGCGCGCTACGCGTCG 168 
Mel Ser Phe Val Val lie He Pro Ala Arg Tyr Ala Ser 
1 5 10 

ACGCGTCTGCCCGGTAAACCATTGGTTGATATTAACGGCAAACCCATG 216 
Thr Arg Leu Pro Gly Lys Pro Leu Val Asp He Asn Gly Lys Pro Met 
15 20 25 

ATTGTTCATGTTCTTGAACGCGCGCGTGAATC^GGTGCCGAGCGCATC 264 
lie Val Kis Val Leu Glu Arg Ala Arg Glu Ser Gly Ala Glu Arg lie 
30 35 40 45 

ATCGTGGCAACCGATCATGAGGATGTTGCCCGCGCCGTTGAAGCCGCT 312 
lie Val Ala Thr Asp His Glu Asp Val Ala Arg Ala Val Glu Ala Ala 
50 55 60 

GGCGGTGAAGTATGTATGACGCGCGCCGATCATCAGTGAGGAACAGAA 360 
Gly Gly Glu Val Cys Met Thr Arg Ala Asp His Gin Ser Gly Thr Glu 
65 70 . 75 

CGTCTGGCGGAAGTTGTCGAAAAATGCGCATTCAGCGACGACACGGTG 408 
Arg Leu Ala Glu Val Val Glu Lys Cys Ala Phe Sisr Asp Asp Thr Val 
80 85 90 

ATC GTT AAT GTG GAG GGT GAT GAA CCG ATG ATC OCT GCG AGA ATC ATT 456 
lie Val Asn Val Gin Gly Asp Glu Pro Met lie Pro Ala Thr lie lie 
95 100 105 

CGT GAG GTT GCT GAT AAC CTC GCT GAG CGT CAG GTG GOT ATG GCG ACT 504 
Arg Gin Val Ala Asp Asn Leu Ala Gin Arg Gin Val Gly Met Ala Thr 
110 115 120 125 

CTGGCGGTGCCAATCCACAATGCGGAAGAAGCGTTTAACCCGAATGCG 552 
Leu Ala Val Pro lie His Asn Ala Glu Glu Ala Phe Asn Pro Asn Ala 
130 135 140 

GTG AAA GTG GTT CTC GAC GCT GAA GGG TAT GCA CTG TAC TTC TCT CGC 600 
Val Lys Val Val Leu Asp Ala Glu Gly Tyr Ala Leu Tyr Phe Ser Arg 
145 150 155 

GCC ACC ATT CCT TGG GAT CGT GAT CGT TTT GCA GAA GGC CTT GAA ACC 648 
Ala Thr Ile Pro Trp Asp Arg Asp Arg Phe Ala Glu Gly Leu Glu Thr 
160 165 170 

GTT GGC GAT AAC TTC CTG CGT CAT CTT GGT ATTTAT GGC TAC CGT GCA 696 
Val Gly Asp Asn Phe Leu Arg His Leu Gly lie Tyr Gly Tyr Arg Ala 
175 180 185 

GGC TTT ATC CGT CGT TAC GTC AAC TGG CAG CCA AGT CCG TTA GAA CAC 744 
Gly Phe He Arg Arg Tyr Val Asn Trp Gin Pro Ser Pro Leu Glu His 
190 195 200 205 

ATC GAA ATG TTA GAG CAG CTT CGT GTT CTG TGG TAC GGC GAA AAA ATC 792 
lie Glu Met Leu Glu Gin Leu Arg Val Leu Trp Tyr Gly Glu Lys lie 
210 21 5 220 

CAT GTT GCT GTT GCT CAG GAA GTT CCT GGC ACA GGT GTG GAT ACC CCT 840 
His Val Ala Val Ala Gin Glu Val Pro Gly Thr Gly Val Asp Thr Pro 
225 230 235 

GAAGATCTCGACCCGTCGACGAATTCCATGTCTACCAACCCGAAACCG 888 
Glu Asp Leu Asp Pro Ser Thr Asn Ser Met Ser Thr Asn Pro Lys Pro 
240 245 250 

CAGAAAAAAAACAAACGTAACACCAACCGTCGTCOSCAGGACGTT^^ 936 
Gin Lys Lys Asn Lys Arg Asn Thr Asn Arg Arg Pro Gin Asp Val Lys 
255 260 265 

TTCCCGGGTGGTGGTCAGATCGTTGGTGGTGTTTACCTGCTGCCGCGT 984 
Phe Pro Giy Gly Gly Gin lie Val Gly Gly Val Tyr Leu Leu Pro Arg 
270 275 280 285 

CGTGGTCCGCGTCTGGGTGrrrCGTGCTACGCGTAAAACCTCTGA^ 1032 
Arg Gly Pro Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg 
290 295 300 

TCTCAGCCGCGTGGGCGTCGTCAGCCGATCCCGAAAGCTCGTCGTCCG 1080 
Ser Gin Pro Arg Gly Arg Arg Gin Pro lie Pro Lys Ala Arg Arg Pro 
305 310 315 

GAA GGT CGT ACC TGG GCT CAG CCG GGT TAC CCG TGG CCG CTG TAC GGT 1 1 28 
Glu Gly Arg Thr Trp Ala Gin Pro Gly Tyr Pro Trp Pro Leu Tyr Gly 
320 325 330 

AACGAAGGTTGCGGTTGGGCTGGTTGGCTGCTGTCTCCGCGTGGATCT 1176 
Asn Glu Gly Cys Gly Trp Ala Gly Trp Leu Leu Ser Pro Arg Gly Ser 
335 340 345 

CGT CCG TCTTGG GGT CCG ACC GAC CCG CGT CGT CGTTCT CGT AAC C"^ 1224 
Arg Pro Ser Trp Gly Pro Thr Asp Pro Arg Arg Arg Ser Arg Asn Leu 
350 355 360 365 

GGT AAA GTT ATC GAT ACC CTG ACC TGC GGTTTC GCT GAC CTG ATG GGT 1272 
Gly Lys Val Ile Asp Thr Leu Thr Cys Gly Phe Ala Asp Leu Met Gly 
370 375 380 

TACATACCGCTGGTTGGAGCTCCGCTGGGTGGTGCTGCTCGTGCT 1317 
Tyr lie Pro Leu Val Gly Ala Pro Leu Gly Gly Ala Ala Arg Ala 
385 390 395 

TAACXXy\TGGATCCTCTAGACTGCAGGCATGCTAAGrAAGTAGATCT^ 1377 

GCTGAAATGC GCTAATTTCA CTTCACGACA CTTCAGCCAA TTTrGGGAGG AGTGTCGTAC 1437 

CGTTACGATTTTCXJTCAATTTTTCTTTTCA ACAATTGATC TCATTCAGGT GAC ATCTTTT 1497 

ATATTGGCGCTCVVTrATGAAAGCAGTAGCTTTTATG^GGGTAATCT 1557 

CGTGCCGAATTAAGCCAmACTGGGCXaAAAAACTCAGTCGTATTGAGTO 1617 

AAAGCXaG^TACGQCGTiTGTGGGCrrnrGTATGACAQ^ 1S77 

GCy\AGVSi3CTTAQaXQCCTAATC>^QCGGGCI i U II I ICG'V2GCG<\GQCTX3GATGGCCT 1737 

TnaX)ATTATGATrCTTUrCQDTTCXX33CX3GCAT^^ 1797 

TGTTOyiOGCAGGrAGATGACGACXi'^TCV^ 1857 

CCAQCCTAACTTCGATCACTGGACXX3CTGATCX3nr^^ 1917 

QCi^TOGAACGGGnTOSCATOSATTGrrAGQCGCCGC^ 1977 

ccmGcxsTCGCcisvcsy^ 2037 . 

CGCTAACGGATTCACCACTCCAAGAATTGGAGCXW\TC^ 2097 

TGCX3CAAACCAACCXJrrGQDAGAACATATCCATCGCX3J^ 2157 

GCXaQDQGATCTOaCSaCAGCGTrGGGTCCTOGCXi^^ 2217 

GnG^QGACCC03CTAGQCTQGCGGGGTTGCCnT/0"GGTTA3CA^ 2277 

Aa3CG<VGCGAAajrG^AGa3^ 2337 

AATGGrrUrrCGGmCXDGTGTTTCGTAAAGTCTGGAAAaSCGGA^ 23S7 

CAmTGrrTCCGGATCTGCATCGCAGGATGCTGCTQGCTACCCTGTGGAACAC^^ 2457 

TGTATTAACGAAGCGCTTCTTCCGCTTCCTCGCTCACTGACTCGCTGCGCTCGGTC^^ 2517 

GGCTGa3GCX3AGCGGTATCAGCTCACTCVV\AGGa3CT 2577 

GGGaJAACXBCAGGAAAGA'VDATGrrGAGCAAAAGGCX^N^^ 2637 

AGGCXODGnTGCTGGCGmTTCCATAGGCTCXX3CXXXXCTG^^ 2697 

GAa3CTCV\AGTCy^G^GGTGGaSWODa3^ 2757 

CTGGAAGCTCCCTCGTGCGCTCTCCTGTTCCXaACCCTGCCGCmCXX^ 2817 

CCnrrCTCCCTTCGGGAAGCGTGGCGCmCTCMTGCTCACGCrrGTAGGTATCTCAGTT 2877 

CG3rGTAGG^CG^rCQCTCCA^GCTQGGCTGTGr^GCi^^ 2937 

GCTGa3Cxr^^A^m3CTA^cTATCG^^CTTGAGTCx:A^\^ 2997 

CACTGGCiflGCAQGCAjrGGTAACAGGfi.mGGAG^QCXSAGCT^^ 3057 

AGnCTTG^AGrrGGTGGCCTA^CTACGGCTACACTAGAAGGACAGTATTTGGrrATCT^ 3117 

CJTCTGCTGAA GCXiAGTrACC TTCX3GAAAAA GAGTTGGTAG CTCTTGATCX: GGCAAACAAA 3177 

CXDACCXBCTGGTAGCGGTGGr I I I I I ICall TGCAAGCAGCAGATTACGCX3CAGAAAAAAAG 3237 

GATCTCAAGAAGATCCmGATCriTrTCTAaSGGGTCTGACXBCTCAG^ 3297 

CACGrrAAGGGATTTTGGTCATGAGATTATCAAAAAGGATCTTCX^C^ 3357 

ATTAAAAATG AAGTTTTAAA TCAATCTAAA GTATATATGA GTAAACTTGG TCTGACAGTT 3417 

ACCAATGCTT AATCAGTGAG GCACCTATCT CAGCGATCTG TCTATTTCGTTCATCCATAG 3477 

TTGOTGACTCXXa3GrKX3TGTA3ATAACTACXiATACQGGA^^ 3537 

GTGCTGCAATGATACCGCGAGACXCACXBCTCACCGGCJTTXAGAT^ 3597 

AGCCA3CCGGAAQGGCCGAGCX3C'\GAAOTGGTCCTGCAACm 3657 

CTATTAATTGTTGCCGGGAi\GCTAGAGTAAGTAGTTCX3CCAGTTAATAGTT^ 3717 

TTGTTGCCATTGCTACAGGCATCGrGGTGTCACGCTOBTCGrm 3777 

GCTDDGGTTCCXWVCGATCAAGGCGAGmCATGATCC^ 3837 

TTAGCrCCriTCX3GTCCTa:GA7a3TTGTCAGAAGrAAGTTGGCXX3CAGTG™ 3897 

TGGTTATGGC AGCACTGCAT AATTCTCTTA CTGTCATGCC ATCCGTrAAGATGCTTTTCTG 3957 

TGACTGGTG^^GTACTCAACC/WJTCiATTUrGAGAATAGTGTATGC^ 4017 

CTTGCXXX3GC GTCAACACGG GATAATACXOS CX3CCACATAG CAGAACTTTA AAAGTGCTCA 4077 

TCATTGGV^AACGTTCTTCGGGGCGAAAACTCTCAAGGATCnrTACCGCTGTTGAG^^ 4137 

GrrCGATCTA ACCCACTCGT GCACCCAACT GATCTTCAGC ATCnTTACTTTCACCAGCG 41 97 

TnUrGGGTGAGCAAAAACAGGAAGGCAAAATGCCQCAAAAAAGGGAATAAGGGCXBACA^ 4257 

GGAAATGTTG AATACTCATACTCTrCCTTTTTCAATATTATTGAAGCATTTATCAGGGTT 4317 

ATTGTCTCAT GAGCGGATAC ATATTTGAAT GTATTTAGAA AAATAAACAA ATAGGGGTTC 4377 

GGCGCACATT TCCCCGAAAA GTGCGACCTG ACGTCTMGA AACCATTATT ATCATGAGAT 4437 



TAACCTATAA AAATAGGCGT ATCACGAGGC CCTTTCGTCT TCAA 4481 

(2) INFORMATION FOR SEQ ID NO:7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 396 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7: 

Met Ser Phe Val Val Ile Ile Pro Ala Arg Tyr Ala Ser Thr Arg Leu 
1 5 10 15 

Pro Gly Lys Pro Leu Val Asp Ile Asn Gly Lys Pro Met Ile Val His 
20 25 30 

Val Leu Glu Arg Ala Arg Glu Ser Gly Ala Glu Arg Ile Ile Val Ala 
35 40 45 

Thr Asp His Glu Asp Val Ala Arg Ala Val Glu Ala Ala Gly Gly Glu 
50 55 60 

Val Cys Met Thr Arg Ala Asp His Gln Ser Gly Thr Glu Arg Leu Ala 
65 70 75 80 

Glu Val Val Glu Lys Cys Ala Phe Ser Asp Asp Thr Val Ile Val Asn 
85 90 95 

Val Gln Gly Asp Glu Pro Met Ile Pro Ala Thr Ile Ile Arg Gln Val 
100 105 110 

Ala Asp Asn Leu Ala Gln Arg Gln Val Gly Met Ala Thr Leu Ala Val 
115 120 125 

Pro Ile His Asn Ala Glu Glu Ala Phe Asn Pro Asn Ala Val Lys Val 
130 135 140 

Val Leu Asp Ala Glu Gly Tyr Ala Leu Tyr Phe Ser Arg Ala Thr Ile 
145 150 155 160 

Pro Trp Asp Arg Asp Arg Phe Ala Glu Gly Leu Glu Thr Val Gly Asp 
165 170 175 

Asn Phe Leu Arg His Leu Gly Ile Tyr Gly Tyr Arg Ala Gly Phe Ile 
180 185 190 

Arg Arg Tyr Val Asn Trp Gln Pro Ser Pro Leu Glu His Ile Glu Met 
195 200 205 

Leu Glu Gln Leu Arg Val Leu Trp Tyr Gly Glu Lys Ile His Val Ala 
210 215 220 

Val Ala Gln Glu Val Pro Gly Thr Gly Val Asp Thr Pro Glu Asp Leu 
225 230 235 240 

Asp Pro Ser Thr Asn Ser Met Ser Thr Asn Pro Lys Pro Gln Lys Lys 
245 250 255 

Asn Lys Arg Asn Thr Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly 
260 265 270 

Gly Gly Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro 
275 280 285 

Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro 
290 295 300 

Arg Gly Arg Arg Gln Pro Ile Pro Lys Ala Arg Arg Pro Glu Gly Arg 
305 310 315 320 

Thr Trp Ala Gln Pro Gly Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly 
325 330 335 

Cys Gly Trp Ala Gly Trp Leu Leu Ser Pro Arg Gly Ser Arg Pro Ser 
340 345 350 

Trp Gly Pro Thr Asp Pro Arg Arg Arg Ser Arg Asn Leu Gly Lys Val 
355 360 365 

Ile Asp Thr Leu Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro 
370 375 380 

Leu Val Gly Ala Pro Leu Gly Gly Ala Ala Arg Ala 
385 390 395 
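The 396 residue length of this protein can be related to the CDS location 130..1317 given for the preceding nucleic acid sequence. The sketch below is illustrative and not part of the original listing. The helper name is hypothetical, and it assumes the location is an inclusive, 1-based range that excludes the stop codon, which is how the listing above appears to use it.

```python
def cds_codon_count(location):
    """Number of codons in a CDS location written as 'start..end'.

    Assumes an inclusive, 1-based range on a single strand, with no
    stop codon included in the range.
    """
    start_text, end_text = location.split("..")
    start, end = int(start_text), int(end_text)
    length = end - start + 1
    if length <= 0 or length % 3 != 0:
        raise ValueError("CDS length must be a positive multiple of 3")
    return length // 3


codons = cds_codon_count("130..1317")
print(codons)
```

Under these assumptions the range spans 1188 bases, or 396 codons, which agrees with the LENGTH field of SEQ ID NO:7.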



(2) INFORMATION FOR SEQ ID NO:8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5600 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: circular 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 

(A) NAME/KEY: CDS 
(B) LOCATION: 130..2472 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8: 

GAATTAATTC CCATTAATGT GAGTTAGCTC ACTCATTAGG CACCCCAGGC TTTACACTTT 60 

ATGnTCCGGCTCGTATTTTGTGTGGWTGTG^GCGGATAACAATTGGGCATCCAGTAAG 120 

GAGGTTTAAATGAGTTTTGTGGTCATTATTCCCGCGCGCTACGCGTCG 168 
Met Ser Phe Val Val Ile Ile Pro Ala Arg Tyr Ala Ser 
1 5 10 

ACX3CXSTCTGCCCGGTAAA(XATrGGTTGATATTAACGGCAAACCCATG 216 
Thr Arg Leu Pro Gly Lys Pro Leu Val Asp lie Asn Giy Lys Pro Met 
15 20 25 

ATTGTTCATGTTCTTGAACGCGCGCGTQAATCAGGTGCCGAGCXBCATC 264 
lie Val His Val Leu Glu Arg Ala Arg Glu Ser Gly Ala Glu Arg lie 
30 35 40 45 

ATCGTGGC>\ACCG^TCATGAQGATGTrGCCCGCGCCGTTGAAGCCGCT 312 
lie Val Ala Thr Asp His Glu Asp Val Ala Arg Ala Val Glu Ala Ala 
50 55 60 

GGCGGTGAAGTATGTATGACGCGCGCXJGATCATCAGTCAGQAACAGAA 360 
Gly Gly Glu Val Cys Met Thr Arg Ala Asp His Gin Ser Gly Thr Glu 
65 70 75 

CGTCTGGCGGAAGTTGTCGAAAAATGCGCATTCAGCGACGAGACGGTG 408 
Arg Leu Ala Glu Val Val Glu Lys Cys Ala Phe Ser Asp Asp Thr Val 
80 85 90 

ATC GTT AAT GTG GAG GGT GAT GAA CCX3 ATG ATC CCT GCG ACA ATC ATT 456 
lie Val Asn Val Gin Gly Asp Glu Pro Met lie Pro Ala Thr lie lie 
95 100 105 

CXBTCAGGTTGGTGATAACCTCGCTCAGOaTCAGGTGGGrrATGGCXSACT 504 
Arg Gin Val Ala Asp Asn i^u Ala Gin Arg Gin VaJ Gly Met Ala Thr 
110 115 120 125 

CTGGCGGTGCX;AATCCACAATGCGGAfi>GAAGCGTTTAACCX:5GAATGCG 552 
Leu Ala Val Pro lie His Asn Ala Glu Glu Ala Phe Asn Pro Asn Ala 
130 135 140 

GTG AAA GTG GTT CTC GAC GGT GAA GGG TAT GCA CTG TAC TTG TCT CGC 600 
Val Lys Val Val Leu Asp Ala Glu Gly Tyr Ala Leu Tyr Phe Ser Arg 
145 150 155 

GCGACCATTCCTTGGGATCX3TGATCGTTTTGCAGAAGGGGTTGAAACC 648 
Ala Thr lie Pro Trp Asp Arg Asp Arg Phe Ala Glu Gly Leu Glu Tiir 
160 165 170 

GTT GGC GAT AAC TTC CTG CGT CAT CTT GGT ATT TAT GGC TAC CGT GCA 696 
Val Gly Asp Asn Phe Leu Arg His Leu Gly lie Tyr Gly Tyr Arg Ala 
175 180 1 85 

QGCTTTATCCGTCGTTACGTCAACTGGGAGCCAAGTCCGTTAGAACAC 744 
Gly Phe lie Arg Arg Tyr Val Asn Trp Gin Pro Ser Pro Leu Glu His 
190 195 200 205 

ATC GAA ATG TTA GAG C AG CTT CGT GTT CTG TGG TAC GGC GAA AAA ATC 792 
lie Glu Met Leu Glu Gin Leu Arg Val Leu Trp Tyr Gly Glu Lys lie 
210 215 220 

CATGTTGCTGTTGCTCAGGAAGTTCCTGGCACAGGTGTGGATACCCCT 840 
His Val Ala Val Ala Gln Glu Val Pro Gly Thr Gly Val Asp Thr Pro 
225 230 235 

GAAG^TCTCGACCXSGTCGACX3AATTCCATGGCTGTrGAGTTTAT^ . 888 

Glu Asp Leu Asp Pro Ser Thr Asn Ser Met Ala Val Asp Phe lie Pro 
240 245 250 

GTT GAA AAT CTC GAG ACT ACT ATG CGT TCT CCG GTT TTC ACT GAC AAC 936 
Val Gtu Asn Leu Glu Thr Thr Met Arg Ser Pro Val Phe Thr Asp Asn 
255 260 265 

TCT TCT CCG CCG GTT GTT CCG CAG TCT TTC C AG GTT GCT CAC CTG CAT 984 
Ser Ser Pro Pro Val Val Pro Gin Ser Phe Gin Val Aia His Leu His 
270 275 280 285 

GCTCCGACTGGTTCTGGTAAATCTACTAAAGTTCCAGCTGCTTACGCT 1032 
Ala Pro Thr Gly Ser Gly Lys Ser Thr Lys Val Pro Ala Ala Tyr Ala 
290 295 300 

GCT CAG GGT TAC AAA GTT CTG GTT CTG AAC CCG TCT GTT GCT GCT ACT 1080 
Ala Gln Gly Tyr Lys Val Leu Val Leu Asn Pro Ser Val Ala Ala Thr 
305 310 315 

CTG GGT TTC GGC GCC TAC ATG TCT AAA GCT CAC GGT ATC GAC CCG AAC 1128 
Leu Gly Phe Gly Ala Tyr Met Ser Lys Ala His Gly Ile Asp Pro Asn 
320 325 330 

ATT CGT ACT GGT GTA CGT ACT ATC ACT ACT GGT TCT CCG ATC ACT TAC 1176 
Ile Arg Thr Gly Val Arg Thr Ile Thr Thr Gly Ser Pro Ile Thr Tyr 
335 340 345 

TCT ACT TAC GGT AAA TTC CTG GCT GAC GGT GGT TGC TCT GGT GGT GCT 1224 
Ser Thr Tyr Gly Lys Phe Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala 
350 355 360 365 

TAC GAT ATC ATC ATC TGC GAC GAA TGC CAC TCT ACT GAC GCT ACT TCT 1272 
Tyr Asp Ile Ile Ile Cys Asp Glu Cys His Ser Thr Asp Ala Thr Ser 
370 375 380 

ATC CTG GGT ATC GGT ACC GTT CTG GAC CAG GCT GAA ACT GCA GGT GCT 1320 
Ile Leu Gly Ile Gly Thr Val Leu Asp Gln Ala Glu Thr Ala Gly Ala 
385 390 395 

CGT CTG GTT GTT CTG GCT ACT GCT ACT CCG CCG GGT TCT GTT ACT GTT 1368 
Arg Leu Val Val Leu Ala Thr Ala Thr Pro Pro Gly Ser Val Thr Val 
400 405 410 

CCG CAC CCG AAC ATC GAA GAA GTT GCT CTG TCG ACT ACT GGT GAA ATC 1416 
Pro His Pro Asn Ile Glu Glu Val Ala Leu Ser Thr Thr Gly Glu Ile 
415 420 425 

CCG TTC TAC GGT AAA GCT ATC CCG CTC GAG GTT ATC AAA GGT GGT CGT 1464 
Pro Phe Tyr Gly Lys Ala Ile Pro Leu Glu Val Ile Lys Gly Gly Arg 
430 435 440 445 

CAC CTG ATT TTC TGC CAC TCT AAA AAA AAA TGC GAC GAA CTG GCT GCT 1512 
His Leu Ile Phe Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala 
450 455 460 

AAG CTT GTT GCT CTG GGT ATC AAC GCT GTT GCT TAC TAC CGT GGT CTG 1560 
Lys Leu Val Ala Leu Gly Ile Asn Ala Val Ala Tyr Tyr Arg Gly Leu 
465 470 475 

GAC GTTTCTGTT ATC CCG ACT TCT GGT GAC GTTGTTGTTGTQ GCC ACT 1608 
Asp Val Ser Val Ile Pro Thr Ser Gly Asp Val Val Val Val Ala Thr 
480 485 490 

GAC GCT CTG ATG ACT GGT TAC ACT GGT GAC TTC GAC TCT GTT ATC GAT 1656 
Asp Ala Leu Met Thr Gly Tyr Thr Gly Asp Phe Asp Ser Val Ile Asp 
495 500 505 

TGC AAC ACT TGC AAT TCG TCG ACC GGT TGC GTT GTT ATC GTT GGT CGT 1704 
Cys Asn Thr Cys Asn Ser Ser Thr Gly Cys Val Val Ile Val Gly Arg 
510 515 520 525 

GTT GTT CTG TCT GGT AAA CCG GCC ATT ATC CCG GAC CGT GAA GTT CTG 1752 
Val Val Leu Ser Gly Lys Pro Ala Ile Ile Pro Asp Arg Glu Val Leu 
530 535 540 

TAC CGT GAG TTC GAC GAA ATG GAA GAA TGC TCT CAG CAC CTG CCG TAC 1800 
Tyr Arg Glu Phe Asp Glu Met Glu Glu Cys Ser Gln His Leu Pro Tyr 
545 550 555 

ATC GAA CAG GGT ATG ATG CTG GCT GAA CAG TTC AAA CAG AAA GCT CTG 1848 
Ile Glu Gln Gly Met Met Leu Ala Glu Gln Phe Lys Gln Lys Ala Leu 
560 565 570 

GGT CTG CTG CAG ACC GCT TCT CGT CAG GCT GAA GTT ATC GCT CCG GCT 1896 
Gly Leu Leu Gln Thr Ala Ser Arg Gln Ala Glu Val Ile Ala Pro Ala 
575 580 585 

GTT CAG ACC AAC TGG CAG AAA CTC GAG ACC TTC TGG GCT AAA CAC ATG 1944 
Val Gln Thr Asn Trp Gln Lys Leu Glu Thr Phe Trp Ala Lys His Met 
590 595 600 605 

TGG AAC TTC ATC TCT GGT ATC CAG TAC CTG GCT GGT CTG TCT ACC CTG 1992 
Trp Asn Phe Ile Ser Gly Ile Gln Tyr Leu Ala Gly Leu Ser Thr Leu 
610 615 620 

CCG GGT AAC CCG GCT ATC GCA AGC TTG ATG GCT TTC ACC GCT GCT GTT 2040 
Pro Gly Asn Pro Ala Ile Ala Ser Leu Met Ala Phe Thr Ala Ala Val 
625 630 635 

ACC TCT CCG CTG ACC ACC TCT CAG ACC CTG CTG TTC AAC ATT CTG GGT 2088 
Thr Ser Pro Leu Thr Thr Ser Gln Thr Leu Leu Phe Asn Ile Leu Gly 
640 645 650 

GGT TGG GTT GCT GCT CAG CTG GCT GCT CCG GGT GCT GCT ACC GCT TTC 2136 
Gly Trp Val Ala Ala Gln Leu Ala Ala Pro Gly Ala Ala Thr Ala Phe 
655 660 665 

GTT GGT GCT GGT CTG GCT GGT GCT GCT ATC GGT TCT GTA GGC CTG GGT 2184 
Val Gly Ala Gly Leu Ala Gly Ala Ala Ile Gly Ser Val Gly Leu Gly 
670 675 680 685 

AAA GTT CTG ATC GAC ATT CTG GCT GGT TAC GGT GCT GGT GTT GCT GGA 2232 
Lys Val Leu Ile Asp Ile Leu Ala Gly Tyr Gly Ala Gly Val Ala Gly 
690 695 700 

GCT CTG GTT GCT TTC AAA ATC ATG TCT GGT GAA GTT CCG TCT ACC GAA 2280 
Ala Leu Val Ala Phe Lys Ile Met Ser Gly Glu Val Pro Ser Thr Glu 
705 710 715 

GAT CTG GTT AAC CTG CTG CCG GCT ATC CTG TCT CCG GGT GCT CTG GTT 2328 
Asp Leu Val Asn Leu Leu Pro Ala Ile Leu Ser Pro Gly Ala Leu Val 
720 725 730 

GTTGGTGTTGTTTGCGCTGCTATCCTGCGTCGTCACGTTGGCCCGGGT 2376 
Val Gly Val Val Cys Ala Ala Ile Leu Arg Arg His Val Gly Pro Gly 
735 740 745 

GAA GGT GCT GTT CAG TGG ATG AAC CGT CTG ATC GCT TTC GCT TCT CGT 2424 
Glu Gly Ala Val Gln Trp Met Asn Arg Leu Ile Ala Phe Ala Ser Arg 
750 755 760 765 

GGTAACCACGTTTCTCCATGGGATCCTCTAGACTGCAGGCATGCTAAG 2472 
Gly Asn His Val Ser Pro Trp Asp Pro Leu Asp Cys Arg His Ala Lys 
770 775 780 

TAAGTAGATC TTGAGCGCXjTTCGCGCTGA^ ATGCGCTAATTTCACTTCAC GACACTTCAG 2532 

CCAATTTTGG GAGGAGTGTC GTACCGrTTAC GATTTTCCTC AAi I I 1 ICTT TTCAACAATT ^82 

GATCrrC ATTC AGGTGACATC TTTTATATTG GCGCTCATTA TGAAAGCAGT AGCTTTTATG 2652 

AGGGTAATCTGMTGGAACAGCn'GCXarGCCGWTAAGCX^ATTTACTGGGCGa^^ 2712 

AGTCGTATTG AGTGCGTCAATGWAAGCG GATACGGCGTTGTGGGCTTT GTATGa^CAGC 2772 

CAGGGAAACCCAATGCXiXSnTAATGGCAAGAAGCTrAGCCCGCCTAATGAGCGGGC 2832 

TnriXa^CXXXaAGGCrrGG^TGGCXrrTC^XATOTG^TTCTTC 2S32 

GCAT(3CCCQZGnQ3AQ3CCATGCTGTCCAa33AC^ 2952 

TTCAAGGATCGCTCGCGGCTCrrrACCAGCXDTA^CTTCGATC^CTGGACCGCTGCT^^ 3012 

CQGCGATTTATGCCGCCT03GCG^GCAC>^TGGVO3GGTTGGC^^ 3072 

CCCTATACCTTGTCTGCXJrCCCCXSCGTTGCGTCQC^ 3132 

CCrrGA^TGG^AGC03GCGQCACCTCGCTA^a33ATTCACCACTCXW\ 3192 

TCWTCTTGCX3GAGAACTGTGMTGCX3CAA^CCAACCCT 3252 

Gr^xQcxvu•c^txi^QCW3CCQCAC^ 3312 

GGTGCQCyaGATCGTGCTCCTGTCXS^ 3372 

TGGnTAQCy^GMTGMTCWDCXS^TACGCZaAGCXS^^ 3432 

GTCTaXa^CCTG^GCA^CW^CATGMTGG^■CTT^^^ 3432 

GCTACCCTGTGGAACVSkCXTACATCTCTATTAACXa^^ 3612 
CTGACTCXaCTGCXBCTaSGTCGrrrCXaGCTG^ 

TAATACGGTrAfCCACAG^AtGAOGGGATAAOSC^^^ 3732 

AQCWWVCaGCCAGGAACXXSrAAAAAGGCXXaCXa^ 3732 

CCCCTO'V^GCATCACAAAAATCG^CGCTCAAC^^ 2852 

TATAAAGATA{XiAGGa3T^T(XXXXJ^GQ^AGCTC^^ 3912 

TQCXXSCnTACCGGATACCTGTmSaiTTTCTC^^ 39*72 

GCTCACX3CTOTAGGfrATCTCAGrrcGGn-GTA3^^ 4032 

AaBAACCaXJCGTTCAGCCCGACCGCTGC^ 4032 

ACCCGGTAAG/CACGOCTOTCGCXVikCTGGG^^ 4152 

CX3AQGTATGn"AGGCX3CTGCTACyiO!GTTCT 4212 

GAAGGACAGTATTTGGTATCTGCGCTCTGCTGAAGCXIAGn'TAODTT 4272 

GTAQCTCTraATCCGGCAA^CAAACCACCX3CTGGTAGCX3GTGGTTTTm 4332 

AGCAGAmCGCGCAGAAAAAAAGGATGTCAAGAAGATCCmGATCmTCTACGGGGT 4392 

CTGACGCrrCAGTGGAACGAAAACTCAGGTTAAGGGATTTTGG^ 4452 

GGATCTTCAC CTAGATCXTTT TTAAATTAAA AATGAAGTTT TAAATCAATC TAAAGTATAT 4512 

ATGAGTAAACTTGGTCTGACAGTTACCAATGCTTAATCAGTG^GGCACXjrATCTCAGC^ 4572 

TCTGTCTATTTC«TTCa^TCXDATAGTTGCCTGACTCCCCGTCGTGTAGATAACTACQATA^ 4632 

GQGA3GGCTrADCATCTGGCCCCACTQCTGCAATG^TACXDGCG^^ 4692 

CTCCAG^mATGAGCAaJAAACCAGCX^AGCaB^^ 47S2 

CAACTTTATG CGCCTGGATC CAGTCTATTA ATTGTTGCCG GGAAGCTAGA GTAAGTAGnT 4S12 

a3CG^GTrMTAG^^^^GCGCAACX3TTGT^GCCATTGCTACAGGCATa3TGGTGTCAa 4872 



WO 93/04088    PCT/US92/07188 



CCTCGTTTGGTATGGCTTCATTCAGCTCXXSGTrCXX^AAaBA 4932 

CCXXXIVkTGTTGrraCAAAAAAGCGGTTAGCTarnC^ 4992 

AGTTGGCCGCAGTGTTATCACTCATGGTTATGGG^GC^TGCATAATTCTCT^ 5052 

TGCXykTCCXSTAAGATGCnTTTCTGTTaACTGGTGAGTACTCAACCA^CT 5112 

AGTUTATQCXBGCXS^aXi^GrrTGCTCTTGCXJCQS^^ 5172 

ATAGCAGMCTTTAAVVGTGCTCATCATTGGAA^ACGTTCTTaBGGGC^ 5232 

GGATCTTACCGCTGrrG^GATCCAGTTCGATGn-AACCCACTC^ 5292 

CAGCATCmTACnTTCACCAGCXriTTCTGGGTG^GC^ 5352 

CAAA^AAGGGAATAAGGGa^ACACGGAAATGTTGAATACTCATACTCTTCui I 1 1 lO^AT 5*12 

ATTATTGAAG CATTTATCAG GGTTATTGTCTCATGAGCGG ATACATATTT GAATGTATTT 5472 

AGAAAAATAAACi^AATAGGGGTTCCGCGCV^CATTTaXCX3/^ 5S2 

AAGAAACCATTATTATCy^TGACATTAACCTATAAAAATAGGCXrrATCACGAGGCC^^ 5592 

GTCTTCAA 5600 

(2) INFORMATION FOR SEQ ID NO:9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 781 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9: 

Met Ser Phe Val Val Ile Ile Pro Ala Arg Tyr Ala Ser Thr Arg Leu 
1 5 10 15 

Pro Gly Lys Pro Leu Val Asp Ile Asn Gly Lys Pro Met Ile Val His 
20 25 30 

Val Leu Glu Arg Ala Arg Glu Ser Gly Ala Glu Arg Ile Ile Val Ala 
35 40 45 

Thr Asp His Glu Asp Val Ala Arg Ala Val Glu Ala Ala Gly Gly Glu 
50 55 60 

Val Cys Met Thr Arg Ala Asp His Gln Ser Gly Thr Glu Arg Leu Ala 
65 70 75 80 

Glu Val Val Glu Lys Cys Ala Phe Ser Asp Asp Thr Val Ile Val Asn 
85 90 95 

Val Gln Gly Asp Glu Pro Met Ile Pro Ala Thr Ile Ile Arg Gln Val 
100 105 110 

Ala Asp Asn Leu Ala Gln Arg Gln Val Gly Met Ala Thr Leu Ala Val 
115 120 125 

Pro Ile His Asn Ala Glu Glu Ala Phe Asn Pro Asn Ala Val Lys Val 
130 135 140 

Val Leu Asp Ala Glu Gly Tyr Ala Leu Tyr Phe Ser Arg Ala Thr Ile 
145 150 155 160 

Pro Trp Asp Arg Asp Arg Phe Ala Glu Gly Leu Glu Thr Val Gly Asp 
165 170 175 

Asn Phe Leu Arg His Leu Gly Ile Tyr Gly Tyr Arg Ala Gly Phe Ile 
180 185 190 

Arg Arg Tyr Val Asn Trp Gln Pro Ser Pro Leu Glu His Ile Glu Met 
195 200 205 

Leu Glu Gln Leu Arg Val Leu Trp Tyr Gly Glu Lys Ile His Val Ala 
210 215 220 

Val Ala Gln Glu Val Pro Gly Thr Gly Val Asp Thr Pro Glu Asp Leu 
225 230 235 240 

Asp Pro Ser Thr Asn Ser Met Ala Val Asp Phe Ile Pro Val Glu Asn 
245 250 255 

Leu Glu Thr Thr Met Arg Ser Pro Val Phe Thr Asp Asn Ser Ser Pro 
260 265 270 

Pro Val Val Pro Gln Ser Phe Gln Val Ala His Leu His Ala Pro Thr 
275 280 285 

Gly Ser Gly Lys Ser Thr Lys Val Pro Ala Ala Tyr Ala Ala Gln Gly 
290 295 300 

Tyr Lys Val Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly Phe 
305 310 315 320 

Gly Ala Tyr Met Ser Lys Ala His Gly Ile Asp Pro Asn Ile Arg Thr 
325 330 335 

Gly Val Arg Thr Ile Thr Thr Gly Ser Pro Ile Thr Tyr Ser Thr Tyr 
340 345 350 

Gly Lys Phe Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp Ile 
355 360 365 

Ile Ile Cys Asp Glu Cys His Ser Thr Asp Ala Thr Ser Ile Leu Gly 
370 375 380 

Ile Gly Thr Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu Val 
385 390 395 400 

Val Leu Ala Thr Ala Thr Pro Pro Gly Ser Val Thr Val Pro His Pro 
405 410 415 

Asn Ile Glu Glu Val Ala Leu Ser Thr Thr Gly Glu Ile Pro Phe Tyr 
420 425 430 

Gly Lys Ala Ile Pro Leu Glu Val Ile Lys Gly Gly Arg His Leu Ile 
435 440 445 

Phe Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu Val 
450 455 460 

Ala Leu Gly Ile Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val Ser 
465 470 475 480 

Val Ile Pro Thr Ser Gly Asp Val Val Val Val Ala Thr Asp Ala Leu 
485 490 495 

Met Thr Gly Tyr Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn Thr 
500 505 510 

Cys Asn Ser Ser Thr Gly Cys Val Val Ile Val Gly Arg Val Val Leu 
515 520 525 

Ser Gly Lys Pro Ala Ile Ile Pro Asp Arg Glu Val Leu Tyr Arg Glu 
530 535 540 

Phe Asp Glu Met Glu Glu Cys Ser Gln His Leu Pro Tyr Ile Glu Gln 
545 550 555 560 

Gly Met Met Leu Ala Glu Gln Phe Lys Gln Lys Ala Leu Gly Leu Leu 
565 570 575 

Gln Thr Ala Ser Arg Gln Ala Glu Val Ile Ala Pro Ala Val Gln Thr 
580 585 590 

Asn Trp Gln Lys Leu Glu Thr Phe Trp Ala Lys His Met Trp Asn Phe 
595 600 605 

Ile Ser Gly Ile Gln Tyr Leu Ala Gly Leu Ser Thr Leu Pro Gly Asn 
610 615 620 

Pro Ala Ile Ala Ser Leu Met Ala Phe Thr Ala Ala Val Thr Ser Pro 
625 630 635 640 

Leu Thr Thr Ser Gln Thr Leu Leu Phe Asn Ile Leu Gly Gly Trp Val 
645 650 655 

Ala Ala Gln Leu Ala Ala Pro Gly Ala Ala Thr Ala Phe Val Gly Ala 
660 665 670 



Gly Leu Ala Gly Ala Ala Ile Gly Ser Val Gly Leu Gly Lys Val Leu 
675 680 685 

Ile Asp Ile Leu Ala Gly Tyr Gly Ala Gly Val Ala Gly Ala Leu Val 
690 695 700 

Ala Phe Lys Ile Met Ser Gly Glu Val Pro Ser Thr Glu Asp Leu Val 
705 710 715 720 

Asn Leu Leu Pro Ala Ile Leu Ser Pro Gly Ala Leu Val Val Gly Val 
725 730 735 

Val Cys Ala Ala Ile Leu Arg Arg His Val Gly Pro Gly Glu Gly Ala 
740 745 750 

Val Gln Trp Met Asn Arg Leu Ile Ala Phe Ala Ser Arg Gly Asn His 
755 760 765 

Val Ser Pro Trp Asp Pro Leu Asp Cys Arg His Ala Lys 
770 775 780 



(2) INFORMATION FOR SEQ ID NO:10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1548 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 

(D) TOPOLOGY: circular 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 

(A) NAME/KEY: CDS 
(B) LOCATION: 1..1548 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10: 

ATGAGTTTTGTGGTCATTATTCCCGCGCGCTACGCGTCGACGCGTCTG 48 
Met Ser Phe Val Val Ile Ile Pro Ala Arg Tyr Ala Ser Thr Arg Leu 
1 5 10 15 

CCCGGTAAACCATTGGTTGATATTAACGGCAAACCCATGATTGTTCAT 96 
Pro Gly Lys Pro Leu Val Asp Ile Asn Gly Lys Pro Met Ile Val His 
20 25 30 

GTTCTTGAACGCGCGCGTGAATCAGGTGCCGAGCGCATCATCGTGGCA 144 
Val Leu Glu Arg Ala Arg Glu Ser Gly Ala Glu Arg Ile Ile Val Ala 
35 40 45 

ACCGATCATGAGGATGTTGCCCGCGCCGTTGAAGCCGCTGGCGGTGAA 192 
Thr Asp His Glu Asp Val Ala Arg Ala Val Glu Ala Ala Gly Gly Glu 
50 55 60 

GTA TGT ATG ACG CGC GCC GAT CAT CAG TCA GGA ACA GAA CGT CTG GCG 240 
Val Cys Met Thr Arg Ala Asp His Gln Ser Gly Thr Glu Arg Leu Ala 
65 70 75 80 

GAA GTT GTC GAA AAA TGC GCA TTC AGC GAC GAC ACG GTG ATC GTT AAT 288 
Glu Val Val Glu Lys Cys Ala Phe Ser Asp Asp Thr Val Ile Val Asn 
85 90 95 

GTG CAG GGT GAT GAA CCG ATG ATC CCT GCG ACA ATC ATT CGT CAG GTT 336 
Val Gln Gly Asp Glu Pro Met Ile Pro Ala Thr Ile Ile Arg Gln Val 
100 105 110 

GCT GAT AAC CTC GCT CAG CGT CAG GTG GGT ATG GCG ACT CTG GCG GTG 384 
Ala Asp Asn Leu Ala Gln Arg Gln Val Gly Met Ala Thr Leu Ala Val 
115 120 125 

CCA ATC CAC AAT GCG GAA GAA GCG TTT AAC CCG AAT GCG GTG AAA GTG 432 
Pro Ile His Asn Ala Glu Glu Ala Phe Asn Pro Asn Ala Val Lys Val 
130 135 140 

GTTCTCGACGCTGAAGGGTATGCACTGTACTTCTCTCGCGCCACCATT 480 
Val Leu Asp Ala Glu Gly Tyr Ala Leu Tyr Phe Ser Arg Ala Thr Ile 
145 150 155 160 

CCTTGGGATCGTGATCGTTTTGCAGAAGGCCTTGAAACCGTTGGCGAT 528 
Pro Trp Asp Arg Asp Arg Phe Ala Glu Gly Leu Glu Thr Val Gly Asp 
165 170 175 

AAC TTC CTG CGT CAT CTT GGT ATT TAT GGC TAC CGT GCA GGC TTT ATC 576 
Asn Phe Leu Arg His Leu Gly Ile Tyr Gly Tyr Arg Ala Gly Phe Ile 
180 185 190 

CGT CGT TAC GTC AAC TGG CAG CCA AGT CCG TTA GAA CAC ATC GAA ATG 624 
Arg Arg Tyr Val Asn Trp Gln Pro Ser Pro Leu Glu His Ile Glu Met 
195 200 205 

TTA GAG CAG CTT CGT GTT CTG TGG TAC GGC GAA AAA ATC CAT GTT GCT 672 
Leu Glu Gln Leu Arg Val Leu Trp Tyr Gly Glu Lys Ile His Val Ala 
210 215 220 

GTT GCT CAG GAA GTT CCT GGC ACA GGT GTG GAT ACC CCT GAA GAT CTC 720 
Val Ala Gln Glu Val Pro Gly Thr Gly Val Asp Thr Pro Glu Asp Leu 
225 230 235 240 

GAC CCG TCG ACG AAT TCC CCA TGG ACC CAC TAC GTT CCG GAA TCT GAC 768 
Asp Pro Ser Thr Asn Ser Pro Trp Thr His Tyr Val Pro Glu Ser Asp 
245 250 255 

GCT GCT GCT CGA GTT ACC GCT ATC CTG TCT TCT CTG ACC GTT ACC CAG 816 
Ala Ala Ala Arg Val Thr Ala Ile Leu Ser Ser Leu Thr Val Thr Gln 
260 265 270 

CTT CTG CGT CGT CTG CAC CAG TGG ATC TCT TCT GAA TGC ACC ACC CCG 864 
Leu Leu Arg Arg Leu His Gln Trp Ile Ser Ser Glu Cys Thr Thr Pro 
275 280 285 

TGC TCT GGT TCT TGG CTG CGT GAC ATC TGG GAC TGG ATC TGC GAA GTT 912 
Cys Ser Gly Ser Trp Leu Arg Asp Ile Trp Asp Trp Ile Cys Glu Val 
290 295 300 

CTGTCTGACTTCAAAACCTGGCTGAAAGCTAAACTGATGCCGCAGCTG 960 
Leu Ser Asp Phe Lys Thr Trp Leu Lys Ala Lys Leu Met Pro Gln Leu 
305 310 315 320 

CCG GGT ATC CCG TTC GTT TCT TGC CAG CGT GGT TAC AAA GGT GTT TGG 1008 
Pro Gly Ile Pro Phe Val Ser Cys Gln Arg Gly Tyr Lys Gly Val Trp 
325 330 335 

CGT GTT GAC GGT ATC ATG CAC ACC CGT TGC CAC TGC GGT GCT GAA ATC 1056 
Arg Val Asp Gly Ile Met His Thr Arg Cys His Cys Gly Ala Glu Ile 
340 345 350 

ACCGGTCACGTTAAAAACGGTACCATGCGTATCGTTGGTCCGCGTAC^ 1104 
Thr Gly His Val Lys Asn Gly Thr Met Arg Ile Val Gly Pro Arg Thr 
355 360 365 

TGC CGT AAC ATG TGG TCT GGC ACC TTC CCG ATC AAC GCT TAC ACC ACC 1152 
Cys Arg Asn Met Trp Ser Gly Thr Phe Pro Ile Asn Ala Tyr Thr Thr 
370 375 380 

GGT CCG TGC ACC CCG CTG CCG GCT CCG AAC TAC ACC TTC GCT CTG TGG 1200 
Gly Pro Cys Thr Pro Leu Pro Ala Pro Asn Tyr Thr Phe Ala Leu Trp 
385 390 395 400 

CGT GTT TCT GCT GAA GAA TAC GTT GAA ATC CGT CAG GTT GGT GAC 1248 
Arg Val Ser Ala Glu Glu Tyr Val Glu Ile Arg Gln Val Gly Asp Phe 
405 410 415 

CAC TAC GTT ACC GGT ATG ACC ACC GAC AAC CTG AAA TGC CCG TGC CAG 1296 
His Tyr Val Thr Gly Met Thr Thr Asp Asn Leu Lys Cys Pro Cys Gln 
420 425 430 

GTT CCG TCT CCG GAG TTC TTC ACC GAA CTG GAC GGT GTT CGT CTG CAC 1344 
Val Pro Ser Pro Glu Phe Phe Thr Glu Leu Asp Gly Val Arg Leu His 
435 440 445 

CGT TTC GCT CCG CCG TGC AAA CCG CTG CTG CGT GAA GAA GTT TCT TTC 1392 
Arg Phe Ala Pro Pro Cys Lys Pro Leu Leu Arg Glu Glu Val Ser Phe 
450 455 460 

CGT GTT GGT CTG CAC GAA TAC CCG GTT GGT TCT CAG CTG CCG TGC GAA 1440 
Arg Val Gly Leu His Glu Tyr Pro Val Gly Ser Gln Leu Pro Cys Glu 
465 470 475 480 

CCG GAA CCG GAC GTT GCT GTT CTG ACC TCT ATG CTG ACC GAC CCG TCT 1488 
Pro Glu Pro Asp Val Ala Val Leu Thr Ser Met Leu Thr Asp Pro Ser 
485 490 495 

CACATCACCGCTGAAGCTGCTGGTCGTCGACTGGATCCTCTAGACTGC 1536 
His Ile Thr Ala Glu Ala Ala Gly Arg Arg Leu Asp Pro Leu Asp Cys 
500 505 510 

AGG CAT GCT AAG 1548 

Arg His Ala Lys 
515 



(2) INFORMATION FOR SEQ ID NO:11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 516 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11: 

Met Ser Phe Val Val Ile Ile Pro Ala Arg Tyr Ala Ser Thr Arg Leu 
1 5 10 15 

Pro Gly Lys Pro Leu Val Asp Ile Asn Gly Lys Pro Met Ile Val His 
20 25 30 

Val Leu Glu Arg Ala Arg Glu Ser Gly Ala Glu Arg Ile Ile Val Ala 
35 40 45 

Thr Asp His Glu Asp Val Ala Arg Ala Val Glu Ala Ala Gly Gly Glu 
50 55 60 

Val Cys Met Thr Arg Ala Asp His Gln Ser Gly Thr Glu Arg Leu Ala 
65 70 75 80 

Glu Val Val Glu Lys Cys Ala Phe Ser Asp Asp Thr Val Ile Val Asn 
85 90 95 

Val Gln Gly Asp Glu Pro Met Ile Pro Ala Thr Ile Ile Arg Gln Val 
100 105 110 

Ala Asp Asn Leu Ala Gln Arg Gln Val Gly Met Ala Thr Leu Ala Val 
115 120 125 

Pro Ile His Asn Ala Glu Glu Ala Phe Asn Pro Asn Ala Val Lys Val 
130 135 140 

Val Leu Asp Ala Glu Gly Tyr Ala Leu Tyr Phe Ser Arg Ala Thr Ile 
145 150 155 160 

Pro Trp Asp Arg Asp Arg Phe Ala Glu Gly Leu Glu Thr Val Gly Asp 
165 170 175 

Asn Phe Leu Arg His Leu Gly Ile Tyr Gly Tyr Arg Ala Gly Phe Ile 
180 185 190 

Arg Arg Tyr Val Asn Trp Gln Pro Ser Pro Leu Glu His Ile Glu Met 
195 200 205 

Leu Glu Gln Leu Arg Val Leu Trp Tyr Gly Glu Lys Ile His Val Ala 
210 215 220 

Val Ala Gln Glu Val Pro Gly Thr Gly Val Asp Thr Pro Glu Asp Leu 
225 230 235 240 

Asp Pro Ser Thr Asn Ser Pro Trp Thr His Tyr Val Pro Glu Ser Asp 
245 250 255 

Ala Ala Ala Arg Val Thr Ala Ile Leu Ser Ser Leu Thr Val Thr Gln 
260 265 270 

Leu Leu Arg Arg Leu His Gln Trp Ile Ser Ser Glu Cys Thr Thr Pro 
275 280 285 

Cys Ser Gly Ser Trp Leu Arg Asp Ile Trp Asp Trp Ile Cys Glu Val 
290 295 300 

Leu Ser Asp Phe Lys Thr Trp Leu Lys Ala Lys Leu Met Pro Gln Leu 
305 310 315 320 

Pro Gly Ile Pro Phe Val Ser Cys Gln Arg Gly Tyr Lys Gly Val Trp 
325 330 335 

Arg Val Asp Gly Ile Met His Thr Arg Cys His Cys Gly Ala Glu Ile 
340 345 350 

Thr Gly His Val Lys Asn Gly Thr Met Arg Ile Val Gly Pro Arg Thr 
355 360 365 

Cys Arg Asn Met Trp Ser Gly Thr Phe Pro Ile Asn Ala Tyr Thr Thr 
370 375 380 

Gly Pro Cys Thr Pro Leu Pro Ala Pro Asn Tyr Thr Phe Ala Leu Trp 
385 390 395 400 

Arg Val Ser Ala Glu Glu Tyr Val Glu Ile Arg Gln Val Gly Asp Phe 
405 410 415 

His Tyr Val Thr Gly Met Thr Thr Asp Asn Leu Lys Cys Pro Cys Gln 
420 425 430 

Val Pro Ser Pro Glu Phe Phe Thr Glu Leu Asp Gly Val Arg Leu His 
435 440 445 

Arg Phe Ala Pro Pro Cys Lys Pro Leu Leu Arg Glu Glu Val Ser Phe 
450 455 460 

Arg Val Gly Leu His Glu Tyr Pro Val Gly Ser Gln Leu Pro Cys Glu 
465 470 475 480 

Pro Glu Pro Asp Val Ala Val Leu Thr Ser Met Leu Thr Asp Pro Ser 
485 490 495 

His Ile Thr Ala Glu Ala Ala Gly Arg Arg Leu Asp Pro Leu Asp Cys 
500 505 510 

Arg His Ala Lys 
515 



(2) INFORMATION FOR SEQ ID NO:12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1623 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: circular 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..1623 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12: 

ATGAGTTTTGTGGTCATTATTCCCGCGCGCTACGCGTCGACGCGTCTG 48 
Met Ser Phe Val Val Ile Ile Pro Ala Arg Tyr Ala Ser Thr Arg Leu 
1 5 10 15 

CCCGGTAAACCATTGGTTGATATTAACGGCAAACCCATGATTGTTCAT 96 
Pro Gly Lys Pro Leu Val Asp Ile Asn Gly Lys Pro Met Ile Val His 
20 25 30 

GTTCTTGAACGCGCGCGTGAATCAGGTGCCGAGCGCATCATCGTGGCA 144 
Val Leu Glu Arg Ala Arg Glu Ser Gly Ala Glu Arg Ile Ile Val Ala 
35 40 45 

ACCGATCATGAGGATGTTGCCCGCGCCGTTGAAGCCGCTGGCGGTGAA 192 
Thr Asp His Glu Asp Val Ala Arg Ala Val Glu Ala Ala Gly Gly Glu 
50 55 60 

GTA TGT ATG ACG CGC GCC GAT CAT CAG TCA GGA ACA GAA CGT CTG GCG 240 
Val Cys Met Thr Arg Ala Asp His Gln Ser Gly Thr Glu Arg Leu Ala 
65 70 75 80 

GAA GTT GTC GAA AAA TGC GCA TTC AGC GAC GAC ACG GTG ATC GTT AAT 288 
Glu Val Val Glu Lys Cys Ala Phe Ser Asp Asp Thr Val Ile Val Asn 
85 90 95 

GTG CAG GGT GAT GAA CCG ATG ATC CCT GCG ACA ATC ATT CGT CAG GTT 336 
Val Gln Gly Asp Glu Pro Met Ile Pro Ala Thr Ile Ile Arg Gln Val 
100 105 110 

GCT GAT AAC CTC GCT CAG CGT CAG GTG GGT ATG GCG ACT CTG GCG GTG 384 
Ala Asp Asn Leu Ala Gln Arg Gln Val Gly Met Ala Thr Leu Ala Val 
115 120 125 

CCA ATC CAC AAT GCG GAA GAA GCG TTT AAC CCG AAT GCG GTG AAA GTG 432 
Pro Ile His Asn Ala Glu Glu Ala Phe Asn Pro Asn Ala Val Lys Val 
130 135 140 

GTT CTC GAC GCT GAA GGG TAT GCA CTG TAC TTC TCT CGC GCC ACC ATT 480 
Val Leu Asp Ala Glu Gly Tyr Ala Leu Tyr Phe Ser Arg Ala Thr Ile 
145 150 155 160 

CCTTGGGATCGTGATCGTTTTGCAGAAGGCCTTGAAACCGTTGGCGAT 528 
Pro Trp Asp Arg Asp Arg Phe Ala Glu Gly Leu Glu Thr Val Gly Asp 
165 170 175 

AAC TTC CTG CGT CAT CTT GGT ATT TAT GGC TAC CGT GCA GGC TTT ATC 576 
Asn Phe Leu Arg His Leu Gly Ile Tyr Gly Tyr Arg Ala Gly Phe Ile 
180 185 190 

CGT CGT TAC GTC AAC TGG CAG CCA AGT CCG TTA GAA CAC ATC GAA ATG 624 
Arg Arg Tyr Val Asn Trp Gln Pro Ser Pro Leu Glu His Ile Glu Met 
195 200 205 

TTA GAG CAG CTT CGT GTT CTG TGG TAC GGC GAA AAA ATC CAT GTT GCT 672 
Leu Glu Gln Leu Arg Val Leu Trp Tyr Gly Glu Lys Ile His Val Ala 
210 215 220 

GTT GCT CAG GAA GTT CCT GGC ACA GGT GTG GAT ACC CCT GAA GAT CTC 720 
Val Ala Gln Glu Val Pro Gly Thr Gly Val Asp Thr Pro Glu Asp Leu 
225 230 235 240 

GAG CCG TGG ACG AATTCT ATG CGT CGACTG GCT CGTGGTTCT CCG GGG 768 
Asp Pro Ser Thr Asn Ser Met Arg Arg Leu Ala Arg Gly Ser Pro Pro 
245 250 255 

TCT GTT GCTTCT TGTTCT GCT TCT GAA CTG TCT GGT CCG TCT GTG AAA 816 
Ser Val Ala Ser Ser Ser Ala Ser Gln Leu Ser Ala Pro Ser Leu Lys 
260 265 270 

GGT ACC TGG ACG GCT AAC GAG GACTGT GGG GAG GCT GAA GTG ATG GiAA 864 
Ala Thr Cys Thr Ala Asn His Asp Ser Pro Asp Ala Glu Leu Ile Glu 
275 280 285 

GCTAACCTGGTGTGGCGTGAGGAAATGGGTGGTAACATCACCGGTGTT 912 
Ala Asn Leu Leu Trp Arg Gln Glu Met Gly Gly Asn Ile Thr Arg Val 
290 295 300 

GAA TCT GAA AAC AAA GTT GTT ATG CTG GAG TCT TTC GAG CCG CTG GTT 960 
Glu Ser Glu Asn Lys Val Val Ile Leu Asp Ser Phe Asp Pro Leu Val 
305 310 315 320 

GCT GAA GAA GAG GAA CGT GAG ATG TCT GTT CCG GGT GAA ATG CTG GGT 1 008 
Ala Glu Glu Asp Glu Arg Glu Ile Ser Val Pro Ala Glu Ile Leu Arg 
325 330 335 

AAATCTCGTCGTTTC GCT GAG GCT GTG CCG GTT TGG GCT CGT CCG GAG 1056 
Lys Ser Arg Arg Phe Ala Gln Ala Leu Pro Val Trp Ala Arg Pro Asp 
340 345 350 

TACAACCX3GCXX3CTGGTTGAAACCTGGAAAAMCCGC^ 1104 
Tyr Asn Pro Pro Leu Val Glu Thr Trp Lys Lys Pro Asp Tyr Glu Pro 
355 360 365 

CXDGGrrTGrrrCACGGrrTGCCCGCTGCXX3(XX3C^ 1152 
Pro Val Val His Gly Cys Pro Leu Pro Pro Pro Lys Ser Pro Pro Val 
370 375 380 

aXaCXXSCCGCGTAAAAAAaSTACCGTTG 1200 
Pro Pro Pro Arg Lys Lys Arg Thr Val Val Leu Thr Glu Ser Thr Leu 
385 390 395 400 

TCT ACC GCT CTG GCT GAA CTG GCT ACC CGT TCT TTC GGT TCT TCT TCT 1 248 
Ser Thr Ala Leu Ala Glu Leu Ala Thr Arg Ser Phe Gly Ser Ser Ser 
405 410 415 

ACC TCG GGT ATC ACC GGT GAG AAC ACC ACC ACC TCT TCT GAA CCG GCT 1 296 
Thr Ser Gly Ile Thr Gly Asp Asn Thr Thr Thr Ser Ser Glu Pro Ala 
420 425 430 

CCG TCT GGTTGC CCG CCG GAC TCT GAC GCT GAA TCTTAC TCTTCT ATG 1344 
Pro Ser Gly Cys Pro Pro Asp Ser Asp Ala Glu Ser Tyr Ser Ser Met 
435 440 445 

CCG CCG CTG G^ GGT GAA a:X3 GGT GAC CCG GAT CTG TCT GAC K 1322 
Pro Pro Leu Glu Gly Glu Pro Gly Asp Pro Asp Leu Ser Asp Gly Ser 
450 455 460 

TGG TCT ACC GTTTCTTCT GAA GCT AAC GCT GAA GAC GTT GTTTGCTGC 1440 
Trp Ser Thr Val Ser Ser Glu Ala Asn Ala Glu Asp Val Val Cys Cys 
465 470 475 480 

TCT ATG TCTTAC TCTTGG ACC GGT GCT CTG GTT ACT CCG TGC GCT GCT 1488 
Ser Met Ser Tyr Ser Trp Thr Gly Ala Leu Val Thr Pro Cys Ala Ala 
485 490 495 

GAA GAA CAG AAA CTG CCG ATG AAC GCT CTG TCT AAC TCT CTG CTG CGT 1535 
Glu Glu Gln Lys Leu Pro Ile Asn Ala Leu Ser Asn Ser Leu Leu Arg 
500 505 510 

CAC CAC AAC CTG GTT TAC TCT ACC ACC TCT CGT TCT GCT TGC CAG CGT 1 584 
His His Asn Leu Val Tyr Ser Thr Thr Ser Arg Ser Ala Cys Gln Arg 
515 520 525 

CAG AAA AAA GTT ACC TTC GAC CGT CTG CAA GTT CTA GAC 1623 

Gln Lys Lys Val Thr Phe Asp Arg Leu Gln Val Leu Asp 
530 535 540 



(2) INFORMATION FOR SEQ ID NO:13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 541 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13: 

Met Ser Phe Val Val Ile Ile Pro Ala Arg Tyr Ala Ser Thr Arg Leu 
1 5 10 15 

Pro Gly Lys Pro Leu Val Asp Ile Asn Gly Lys Pro Met Ile Val His 
20 25 30 

Val Leu Glu Arg Ala Arg Glu Ser Gly Ala Glu Arg Ile Ile Val Ala 
35 40 45 

Thr Asp His Glu Asp Val Ala Arg Ala Val Glu Ala Ala Gly Gly Glu 
50 55 60 

Val Cys Met Thr Arg Ala Asp His Gln Ser Gly Thr Glu Arg Leu Ala 
65 70 75 80 

Glu Val Val Glu Lys Cys Ala Phe Ser Asp Asp Thr Val Ile Val Asn 
85 90 95 

Val Gln Gly Asp Glu Pro Met Ile Pro Ala Thr Ile Ile Arg Gln Val 
100 105 110 

Ala Asp Asn Leu Ala Gln Arg Gln Val Gly Met Ala Thr Leu Ala Val 
115 120 125 

Pro Ile His Asn Ala Glu Glu Ala Phe Asn Pro Asn Ala Val Lys Val 
130 135 140 

Val Leu Asp Ala Glu Gly Tyr Ala Leu Tyr Phe Ser Arg Ala Thr Ile 
145 150 155 160 

Pro Trp Asp Arg Asp Arg Phe Ala Glu Gly Leu Glu Thr Val Gly Asp 
165 170 175 

Asn Phe Leu Arg His Leu Gly Ile Tyr Gly Tyr Arg Ala Gly Phe Ile 
180 185 190 

Arg Arg Tyr Val Asn Trp Gln Pro Ser Pro Leu Glu His Ile Glu Met 
195 200 205 

Leu Glu Gln Leu Arg Val Leu Trp Tyr Gly Glu Lys Ile His Val Ala 
210 215 220 

Val Ala Gln Glu Val Pro Gly Thr Gly Val Asp Thr Pro Glu Asp Leu 
225 230 235 240 



Asp Pro Ser Thr Asn Ser Met Arg Arg Leu Ala Arg Gly Ser Pro Pro 
245 250 255 

Ser Val Ala Ser Ser Ser Ala Ser Gln Leu Ser Ala Pro Ser Leu Lys 
260 265 270 

Ala Thr Cys Thr Ala Asn His Asp Ser Pro Asp Ala Glu Leu Ile Glu 
275 280 285 

Ala Asn Leu Leu Trp Arg Gln Glu Met Gly Gly Asn Ile Thr Arg Val 
290 295 300 

Glu Ser Glu Asn Lys Val Val Ile Leu Asp Ser Phe Asp Pro Leu Val 
305 310 315 320 

Ala Glu Glu Asp Glu Arg Glu Ile Ser Val Pro Ala Glu Ile Leu Arg 
325 330 335 

Lys Ser Arg Arg Phe Ala Gln Ala Leu Pro Val Trp Ala Arg Pro Asp 
340 345 350 

Tyr Asn Pro Pro Leu Val Glu Thr Trp Lys Lys Pro Asp Tyr Glu Pro 
355 360 365 

Pro Val Val His Gly Cys Pro Leu Pro Pro Pro Lys Ser Pro Pro Val 
370 375 380 

Pro Pro Pro Arg Lys Lys Arg Thr Val Val Leu Thr Glu Ser Thr Leu 
385 390 395 400 

Ser Thr Ala Leu Ala Glu Leu Ala Thr Arg Ser Phe Gly Ser Ser Ser 
405 410 415 

Thr Ser Gly Ile Thr Gly Asp Asn Thr Thr Thr Ser Ser Glu Pro Ala 
420 425 430 

Pro Ser Gly Cys Pro Pro Asp Ser Asp Ala Glu Ser Tyr Ser Ser Met 
435 440 445 

Pro Pro Leu Glu Gly Glu Pro Gly Asp Pro Asp Leu Ser Asp Gly Ser 
450 455 460 

Trp Ser Thr Val Ser Ser Glu Ala Asn Ala Glu Asp Val Val Cys Cys 
465 470 475 480 

Ser Met Ser Tyr Ser Trp Thr Gly Ala Leu Val Thr Pro Cys Ala Ala 
485 490 495 

Glu Glu Gln Lys Leu Pro Ile Asn Ala Leu Ser Asn Ser Leu Leu Arg 
500 505 510 

His His Asn Leu Val Tyr Ser Thr Thr Ser Arg Ser Ala Cys Gln Arg 
515 520 525 



Gln Lys Lys Val Thr Phe Asp Arg Leu Gln Val Leu Asp 
530 535 540 



(2) INFORMATION FOR SEQ ID NO:14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1488 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: circular 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 

(A) NAME/KEY: CDS 
(B) LOCATION: 1..1488 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:14: 

ATGAGTTTTGTGGTCATTATTCCCGCGCGCTACGCGTCGACGCGTCTG 48 
Met Ser Phe Val Val Ile Ile Pro Ala Arg Tyr Ala Ser Thr Arg Leu 
1 5 10 15 

CCCGGTAAACCATTGGTTGATATTAACGGCAAACCCATGATTGTTCAT 96 
Pro Gly Lys Pro Leu Val Asp Ile Asn Gly Lys Pro Met Ile Val His 
20 25 30 

GTTCTTGAACGCGCGCGTGAATCAGGTGCCGAGCGCATCATCGTGGCA 144 
Val Leu Glu Arg Ala Arg Glu Ser Gly Ala Glu Arg Ile Ile Val Ala 
35 40 45 



ACCGATCATGAGGATGTTGCCCGCGCCGTTGAAGCCGCTGGCGGTGAA 192 
Thr Asp His Glu Asp Val Ala Arg Ala Val Glu Ala Ala Gly Gly Glu 
50 55 60 

GTA TGT ATG ACG CGC GCC GAT CAT CAG TCA GGA ACA GAA CGT CTG GCG 240 
Val Cys Met Thr Arg Ala Asp His Gln Ser Gly Thr Glu Arg Leu Ala 
65 70 75 80 

GAAGTTGTCGAAAAATGCGCATTCAGCGACGACACGGTGATCGTTAAT 288 
Glu Val Val Glu Lys Cys Ala Phe Ser Asp Asp Thr Val Ile Val Asn 
85 90 95 

GTG CAG GGT GAT GAA CCG ATG ATC CCT GCG ACA ATC ATT CGT CAG GTT 336 
Val Gln Gly Asp Glu Pro Met Ile Pro Ala Thr Ile Ile Arg Gln Val 
100 105 110 

GCT GAT AAC CTC GCT CAG CGT CAG GTG GGT ATG GCG ACT CTG GCG GTG 384 
Ala Asp Asn Leu Ala Gln Arg Gln Val Gly Met Ala Thr Leu Ala Val 
115 120 125 

CCA ATC CAC AAT GCG GAA GAA GCG TTT AAC CCG AAT GCG GTG AAA GTG 432 
Pro Ile His Asn Ala Glu Glu Ala Phe Asn Pro Asn Ala Val Lys Val 
130 135 140 

GTT CTC GAC GCT GAA GGG TAT GCA CTG TAC TTC TCT CGC GCC ACC ATT 480 
Val Leu Asp Ala Glu Gly Tyr Ala Leu Tyr Phe Ser Arg Ala Thr Ile 
145 150 155 160 

CCTTGGGATCGTGATCGTTTTGCAGAAGGCCTTGAAACCGTTGGCGAT 528 
Pro Trp Asp Arg Asp Arg Phe Ala Glu Gly Leu Glu Thr Val Gly Asp 
165 170 175 

AAC TTC CTG CGT CAT CTT GGT ATT TAT GGC TAC CGT GCA GGC TTT ATC 576 
Asn Phe Leu Arg His Leu Gly Ile Tyr Gly Tyr Arg Ala Gly Phe Ile 
180 185 190 

CGT CGT TAC GTC AAC TGG CAG CCA AGT CCG TTA GAA CAC ATC GAA ATG 624 
Arg Arg Tyr Val Asn Trp Gln Pro Ser Pro Leu Glu His Ile Glu Met 
195 200 205 

TTA GAG CAG CTT CGT GTT CTG TGG TAC GGC GAA AAA ATC CAT GTT GCT 672 
Leu Glu Gln Leu Arg Val Leu Trp Tyr Gly Glu Lys Ile His Val Ala 
210 215 220 

GTTGCTCAGGAAGTTCCTGGCACAGGTGTGGATACCCCTGAAGATCTC 720 
Val Ala Gln Glu Val Pro Gly Thr Gly Val Asp Thr Pro Glu Asp Leu 
225 230 235 240 

GAG CCG TGG ACG AATTCTCTA GAC TCC CAC TAG CAG GAG GTT CTG AAA 768 
Asp Pro Ser Thr Asn Ser Leu Asp Ser His Tyr Gin Asp Val Leu Lys 
245 250 255 

GAA GTT AAA GCT GGT GGT TCT AAA GTT AAA GGT AAC CTG CTG TGT GTT 81 6 
Glu Val Lys Ala Ala Ala Ser Lys Val Lys Ala Asn Leu Leu Ser Val 
260 265 270 

GAAGMGGATGCTCTCTGAGCCCGCCGCAGTCTGCTAAATCTAAATTC 864 
Glu Glu Ala Cys Ser Leu Thr Pro Pro His Ser Ala Lys Ser Lys Phe 
275 280 285 

GGTTAC GGT GCT AAA GAG GTT CGT TGG GAG GGT CGT AAA GCT GTT ACG 912 
Gly Tyr Gly Ala Lys Asp Val Arg Cys His Ala Arg Lys Ala Val Thr 
290 295 300 

CAC ATG AAC TGT GTT TGG AAA GAT CTG CTG GAA GAG AAC GTT ACG CCG 960 
His Ile Asn Ser Val Trp Lys Asp Leu Leu Glu Asp Asn Val Thr Pro 
305 310 315 320 

ATC GAC ACG ACG ATG ATG GCT AAA AAC GAA GTT TTC TGG GTT CAG CCG 1 008 
Ile Asp Thr Thr Ile Met Ala Lys Asn Glu Val Phe Cys Val Gln Pro 
325 330 335 

GAA AAA GGT GGT G GT AAA CCG GCT CGT CTG ATC GTT TTC CCG GAC CTG 1 056 
Glu Lys Gly Gly Arg Lys Pro Ala Arg Leu Ile Val Phe Pro Asp Leu 
340 345 350 

GGT GTT CGT GTT TGG GAA AAA ATG GCT CTG TAG GAC GTT GTT ACG AAA 1104 
Gly Val Arg Val Cys Glu Lys Met Ala Leu Tyr Asp Val Val Thr Lys 
355 360 365 

CTG CCG CTG GCT GTT ATG GGT TCT TCT TAG GGT TTC CAG TAG TCT CCG 1 152 
Leu Pro Leu Ala Val Met Giy Ser Ser Tyr Gly Phe Gin Tyr Ser Pro 
370 375 380 
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GGT GAG CGT GTT G^G TTC CTG GTT GAG GCT TGG AAA TOT AAA AAA ACC 1200 
Gly Gln Arg Val Glu Phe Leu Val Gln Ala Trp Lys Ser Lys Lys Thr
385 390 395 400 

CCG ATG GGTTTC TCT TAG GAG ACG CGTTGC TTC GAG TGT AGG GTT ACG 1248 
Pro Met Gly Phe Ser Tyr Asp Thr Arg Cys Phe Asp Ser Thr Val Thr
405 410 415 

GAA TCT GAG ATT CGT ACC GAA GAA GCT ATG TAC CAG TGC TGG GAG CTG 1296 
Glu Ser Asp Ile Arg Thr Glu Glu Ala Ile Tyr Gln Cys Cys Asp Leu
420 425 430 

GACCCGCAGGCTCGTGTrGCTATGAAATGTCTGACGGAACGTGTGTAG 1344 
Asp Pro Gln Ala Arg Val Ala Ile Lys Ser Leu Thr Glu Arg Leu Tyr
435 440 445 

GTrGGTGGTCCGCTGACCAACTGTGGGGGTGAAAACTGCGGTTACGGT 1392 
Val Gly Gly Pro Leu Thr Asn Ser Arg Gly Glu Asn Cys Gly Tyr Arg 
450 455 460 

GGT TGG GGT GCT TCT GGT GTT CTG ACC AGG TCT TGC GGT AAC ACC CTG 1440 
Arg Cys Arg Ala Ser Gly Val Leu Thr Thr Ser Cys Gly Asn Thr Leu 
465 470 475 480 

ACG TGC TAC ATG AAA GCT CGT GCT GCT TGC CGT GCT GCT GGT CTG CAG 1488 
Thr Cys Tyr Ile Lys Ala Arg Ala Ala Cys Arg Ala Ala Gly Leu Gln
485 490 495 



(2) INFORMATION FOR SEQ ID NO:15: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 496 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:15: 

Met Ser Phe Val Val Ile Ile Pro Ala Arg Tyr Ala Ser Thr Arg Leu
1 5 10 15

Pro Gly Lys Pro Leu Val Asp Ile Asn Gly Lys Pro Met Ile Val His
20 25 30

Val Leu Glu Arg Ala Arg Glu Ser Gly Ala Glu Arg Ile Ile Val Ala
35 40 45

Thr Asp His Glu Asp Val Ala Arg Ala Val Glu Ala Ala Gly Gly Glu
50 55 60

Val Cys Met Thr Arg Ala Asp His Gln Ser Gly Thr Glu Arg Leu Ala
65 70 75 80

Glu Val Val Glu Lys Cys Ala Phe Ser Asp Asp Thr Val Ile Val Asn
85 90 95

Val Gln Gly Asp Glu Pro Met Ile Pro Ala Thr Ile Ile Arg Gln Val
100 105 110

Ala Asp Asn Leu Ala Gln Arg Gln Val Gly Met Ala Thr Leu Ala Val
115 120 125

Pro Ile His Asn Ala Glu Glu Ala Phe Asn Pro Asn Ala Val Lys Val
130 135 140

Val Leu Asp Ala Glu Gly Tyr Ala Leu Tyr Phe Ser Arg Ala Thr Ile
145 150 155 160

Pro Trp Asp Arg Asp Arg Phe Ala Glu Gly Leu Glu Thr Val Gly Asp 
165 170 175 

Asn Phe Leu Arg His Leu Gly Ile Tyr Gly Tyr Arg Ala Gly Phe Ile
180 185 190

Arg Arg Tyr Val Asn Trp Gln Pro Ser Pro Leu Glu His Ile Glu Met
195 200 205

Leu Glu Gln Leu Arg Val Leu Trp Tyr Gly Glu Lys Ile His Val Ala
210 215 220

Val Ala Gln Glu Val Pro Gly Thr Gly Val Asp Thr Pro Glu Asp Leu
225 230 235 240

Asp Pro Ser Thr Asn Ser Leu Asp Ser His Tyr Gln Asp Val Leu Lys
245 250 255 

Glu Val Lys Ala Ala Ala Ser Lys Val Lys Ala Asn Leu Leu Ser Val 
260 265 270 

Glu Glu Ala Cys Ser Leu Thr Pro Pro His Ser Ala Lys Ser Lys Phe 
275 280 285 

Gly Tyr Gly Ala Lys Asp Val Arg Cys His Ala Arg Lys Ala Val Thr
290 295 300

His Ile Asn Ser Val Trp Lys Asp Leu Leu Glu Asp Asn Val Thr Pro
305 310 315 320

Ile Asp Thr Thr Ile Met Ala Lys Asn Glu Val Phe Cys Val Gln Pro
325 330 335

Glu Lys Gly Gly Arg Lys Pro Ala Arg Leu Ile Val Phe Pro Asp Leu
340 345 350

Gly Val Arg Val Cys Glu Lys Met Ala Leu Tyr Asp Val Val Thr Lys
355 360 365

Leu Pro Leu Ala Val Met Gly Ser Ser Tyr Gly Phe Gln Tyr Ser Pro
370 375 380

Gly Gln Arg Val Glu Phe Leu Val Gln Ala Trp Lys Ser Lys Lys Thr
385 390 395 400 

Pro Met Gly Phe Ser Tyr Asp Thr Arg Cys Phe Asp Ser Thr Val Thr 
405 410 415 

Glu Ser Asp Ile Arg Thr Glu Glu Ala Ile Tyr Gln Cys Cys Asp Leu
420 425 430

Asp Pro Gln Ala Arg Val Ala Ile Lys Ser Leu Thr Glu Arg Leu Tyr
435 440 445

Val Gly Gly Pro Leu Thr Asn Ser Arg Gly Glu Asn Cys Gly Tyr Arg
450 455 460 

Arg Cys Arg Ala Ser Gly Val Leu Thr Thr Ser Cys Gly Asn Thr Leu 
465 470 475 480 

Thr Cys Tyr Ile Lys Ala Arg Ala Ala Cys Arg Ala Ala Gly Leu Gln
485 490 495 



(2) INFORMATION FOR SEQ ID NO:16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1161 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: circular

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 

(A) NAME/KEY: CDS 
(B) LOCATION: 1..1161 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:16:

ATG AGT TTT GTG GTC ATT ATT CCC GCG CGC TAC GCG TCG ACG CGT CTG 48
Met Ser Phe Val Val Ile Ile Pro Ala Arg Tyr Ala Ser Thr Arg Leu
1 5 10 15 

CCC GGT AAA CCA TTG GTT GAT ATT AAC GGC AAA CCC ATG ATT GTT CAT 96
Pro Gly Lys Pro Leu Val Asp Ile Asn Gly Lys Pro Met Ile Val His
20 25 30 

GTTCTTGAACGCGCGCGTGAATCAGGTGCCGAGCGCATCATCGTGGCA 144 
Val Leu Glu Arg Ala Arg Glu Ser Gly Ala Glu Arg Ile Ile Val Ala
35 40 45 

ACCGATCATGAGGATGTTGCCCGCGCCGTTGAAGCCGCTGGCGGTGAA 192
Thr Asp His Glu Asp Val Ala Arg Ala Val Glu Ala Ala Gly Gly Glu 
50 55 60

GTATGTATGACGCGCGCCGATCATCAGTCAGGAACAGAACGTCTGGCG 240
Val Cys Met Thr Arg Ala Asp His Gln Ser Gly Thr Glu Arg Leu Ala
65 70 75 80 

GAAGTTGTCGAAAAATGCGCATTCAGCGACGACACGGTGATCGTTAAT 288
Glu Val Val Glu Lys Cys Ala Phe Ser Asp Asp Thr Val Ile Val Asn
85 90 95

GTGCAGGGTGATGAACCGATGATCCCTGCGACAATCATTCGTCAGGTT 336
Val Gln Gly Asp Glu Pro Met Ile Pro Ala Thr Ile Ile Arg Gln Val
100 105 110 

GCTGATAACCTCGCTCAGCGTCAGGTGGGTATGGCGACTCTGGCGGTG 384
Ala Asp Asn Leu Ala Gln Arg Gln Val Gly Met Ala Thr Leu Ala Val
115 120 125 

CCAATCCACAATGCGGAAGAAGCGTTTAACCCGAATGCGGTGAAAGTG 432
Pro Ile His Asn Ala Glu Glu Ala Phe Asn Pro Asn Ala Val Lys Val
130 135 140 

GTTCTGGACGCTGAAGGGTATGCACTGTACTTCTCTCGCGCCACGATT 480
Val Leu Asp Ala Glu Gly Tyr Ala Leu Tyr Phe Ser Arg Ala Thr Ile
145 150 155 160 

CCTTGGGATCGTGATCGTTTTGCAGAAGGCCTTGAAACCGTTGGCGAT 528
Pro Trp Asp Arg Asp Arg Phe Ala Glu Gly Leu Glu Thr Val Gly Asp 
165 170 175 

AAC TTC CTG CGT CAT CTT GGT ATT TAT GGG TAC CGT GCA GGC TTT ATC 576
Asn Phe Leu Arg His Leu Gly Ile Tyr Gly Tyr Arg Ala Gly Phe Ile
180 185 190 

CGT CGT TAC GTC AAC TGG CAG CCA AGT CCG TTA GAA CAC ATC GAA ATG 624
Arg Arg Tyr Val Asn Trp Gln Pro Ser Pro Leu Glu His Ile Glu Met
195 200 205 

TTA GAG CAG CTT CGT GTT CTG TGG TAC GGC GAA AAA ATC CAT GTT GCT 672
Leu Glu Gln Leu Arg Val Leu Trp Tyr Gly Glu Lys Ile His Val Ala
210 215 220 

GTT GCT CAG GAA GTT CCT GGC ACA GGT GTG GAT ACC CCT GAA GAT CTC 720
Val Ala Gln Glu Val Pro Gly Thr Gly Val Asp Thr Pro Glu Asp Leu
225 230 235 240 

GAC CCG TCG ACG AAT TGC ATG CTG CAG GAC TGC ACC ATG CTG GTT TGC 768
Asp Pro Ser Thr Asn Cys Met Leu Gln Asp Cys Thr Met Leu Val Cys
245 250 255 

GGT GAC GAC CTG GTT GTT ATC TGC GAA TCT GCT GGT GTT CAG GAA GAC 816
Gly Asp Asp Leu Val Val Ile Cys Glu Ser Ala Gly Val Gln Glu Asp
260 265 270 

GCT GCT TCT CTG CGT GCT TTC ACC GAA GCT ATG ACC CGT TAC TCT GCT 864
Ala Ala Ser Leu Arg Ala Phe Thr Glu Ala Met Thr Arg Tyr Ser Ala
275 280 285 

CCCCX3GGGTGAC(XGCa3Cy\G(XGG^ATACGACCTGG^ 912 
Pro Pro Gly Asp Pro Pro Gln Pro Glu Tyr Asp Leu Glu Leu Ile Thr
290 295 300

TCT TGC TCT TCT AAC GTT TCT GTT GCT CAC GAC GGT GCT GGT AAA CGT 960 
Ser Cys Ser Ser Asn Val Ser Val Ala His Asp Gly Ala Gly Lys Arg 
305 310 315 320 

GTT TAC TAC CTG ACC CGT GAC CCG ACC ACC CCG CTG GCT CGT GCT GCT 1008
Val Tyr Tyr Leu Thr Arg Asp Pro Thr Thr Pro Leu Ala Arg Ala Ala 
325 330 335 

TGG GAA ACC GCT CGT CAC ACC CCG GTA AAC TCT TGG CTG GGT AAC ATC 1056
Trp Glu Thr Ala Arg His Thr Pro Val Asn Ser Trp Leu Gly Asn Ile
340 345 350 

ATC ATG TTC GCT CCG ACC CTG TGG GCC CGT ATG ATC CTG ATG ACC CAC 1104
Ile Met Phe Ala Pro Thr Leu Trp Ala Arg Met Ile Leu Met Thr His
355 360 365 

TTC TTC TCT GTT CTG ATC GCT CGT GAC CAG CTG GAA CAG GCT CTG GAC 1152
Phe Phe Ser Val Leu Ile Ala Arg Asp Gln Leu Glu Gln Ala Leu Asp
370 375 380 

TGC GAG ATC 1161
Cys Glu Ile
385



(2) INFORMATION FOR SEQ ID NO:17:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 387 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:17:

Met Ser Phe Val Val Ile Ile Pro Ala Arg Tyr Ala Ser Thr Arg Leu
1 5 10 15

Pro Gly Lys Pro Leu Val Asp Ile Asn Gly Lys Pro Met Ile Val His
20 25 30

Val Leu Glu Arg Ala Arg Glu Ser Gly Ala Glu Arg Ile Ile Val Ala
35 40 45

Thr Asp His Glu Asp Val Ala Arg Ala Val Glu Ala Ala Gly Gly Glu
50 55 60

Val Cys Met Thr Arg Ala Asp His Gln Ser Gly Thr Glu Arg Leu Ala
65 70 75 80

Glu Val Val Glu Lys Cys Ala Phe Ser Asp Asp Thr Val Ile Val Asn
85 90 95

Val Gln Gly Asp Glu Pro Met Ile Pro Ala Thr Ile Ile Arg Gln Val
100 105 110

Ala Asp Asn Leu Ala Gln Arg Gln Val Gly Met Ala Thr Leu Ala Val
115 120 125

Pro Ile His Asn Ala Glu Glu Ala Phe Asn Pro Asn Ala Val Lys Val
130 135 140

Val Leu Asp Ala Glu Gly Tyr Ala Leu Tyr Phe Ser Arg Ala Thr Ile
145 150 155 160 

Pro Trp Asp Arg Asp Arg Phe Ala Glu Gly Leu Glu Thr Val Gly Asp 
165 170 175 

Asn Phe Leu Arg His Leu Gly Ile Tyr Gly Tyr Arg Ala Gly Phe Ile
180 185 190

Arg Arg Tyr Val Asn Trp Gln Pro Ser Pro Leu Glu His Ile Glu Met
195 200 205

Leu Glu Gln Leu Arg Val Leu Trp Tyr Gly Glu Lys Ile His Val Ala
210 215 220

Val Ala Gln Glu Val Pro Gly Thr Gly Val Asp Thr Pro Glu Asp Leu
225 230 235 240

Asp Pro Ser Thr Asn Cys Met Leu Gln Asp Cys Thr Met Leu Val Cys
245 250 255

Gly Asp Asp Leu Val Val Ile Cys Glu Ser Ala Gly Val Gln Glu Asp
260 265 270

Ala Ala Ser Leu Arg Ala Phe Thr Glu Ala Met Thr Arg Tyr Ser Ala
275 280 285

Pro Pro Gly Asp Pro Pro Gln Pro Glu Tyr Asp Leu Glu Leu Ile Thr
290 295 300

Ser Cys Ser Ser Asn Val Ser Val Ala His Asp Gly Ala Gly Lys Arg
305 310 315 320 

Val Tyr Tyr Leu Thr Arg Asp Pro Thr Thr Pro Leu Ala Arg Ala Ala
325 330 335

Trp Glu Thr Ala Arg His Thr Pro Val Asn Ser Trp Leu Gly Asn Ile
340 345 350

Ile Met Phe Ala Pro Thr Leu Trp Ala Arg Met Ile Leu Met Thr His
355 360 365

Phe Phe Ser Val Leu Ile Ala Arg Asp Gln Leu Glu Gln Ala Leu Asp
370 375 380

Cys Glu Ile
385 



(2) INFORMATION FOR SEQ ID NO:18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1179 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: circular 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE:

(A) NAME/KEY: CDS
(B) LOCATION: 1..1179

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:18:

ATGAGTTTTGTGGTCATTATTCCCGCGCGCTACGCGTCGACGCGTCTG 48
Met Ser Phe Val Val Ile Ile Pro Ala Arg Tyr Ala Ser Thr Arg Leu
1 5 10 15 

CCCGGTAAACCATTGGTTGATATTAACGGCAAACCCATGATTGTTCAT 96 
Pro Gly Lys Pro Leu Val Asp Ile Asn Gly Lys Pro Met Ile Val His
20 25 30 

GTTCTTGAACGCGCGCGTGAATCAGGTGCCGAGCGCATCATCGTGGCA 144
Val Leu Glu Arg Ala Arg Glu Ser Gly Ala Glu Arg Ile Ile Val Ala
35 40 45 

ACCGATCATGAGGATGTTGCCCGCGCCGTTGAAGCCGCTGGCGGTGAA 192
Thr Asp His Glu Asp Val Ala Arg Ala Val Glu Ala Ala Gly Gly Glu 
50 55 60 

GTATGTATGACGCGCGCCGATCATCAGTCAGGAACAGAACGTCTGGCG 240
Val Cys Met Thr Arg Ala Asp His Gln Ser Gly Thr Glu Arg Leu Ala
65 70 75 80 

GAA GTT GTC GAA AAA TGC GCA TTC AGC GAC GAC ACG GTG ATC GTT AAT 288
Glu Val Val Glu Lys Cys Ala Phe Ser Asp Asp Thr Val Ile Val Asn
85 90 95 

GTG CAG GGT GAT GAA CCG ATG ATC CCT GCG ACA ATC ATT CGT CAG GTT 336
Val Gln Gly Asp Glu Pro Met Ile Pro Ala Thr Ile Ile Arg Gln Val
100 105 110 

GCT GAT AAC CTC GCT CAG CGT CAG GTG GGT ATG GCG ACT CTG GCG GTG 384
Ala Asp Asn Leu Ala Gln Arg Gln Val Gly Met Ala Thr Leu Ala Val
115 120 125

CCAATCCACAATGCGGAAGAAGCGTTTAACCCGAATGCGGTGAAAGTG 432
Pro Ile His Asn Ala Glu Glu Ala Phe Asn Pro Asn Ala Val Lys Val
130 135 140 

GTT CTC GAC GCT GAA GGG TAT GCA CTG TAC TTC TCT CGC GCC ACC ATT 480
Val Leu Asp Ala Glu Gly Tyr Ala Leu Tyr Phe Ser Arg Ala Thr Ile
145 150 155 160 

CCTTGGGATCGTGATCGTTTTGCAGAAGGCCTTGAAACCGTTGGCGAT 528 
Pro Trp Asp Arg Asp Arg Phe Ala Glu Gly Leu Glu Thr Val Gly Asp 
165 170 175 

AAC TTC CTG CGT CAT CTT GGT ATT TAT GGG TAC CGT GCA GGC TTT ATC 576
Asn Phe Leu Arg His Leu Gly Ile Tyr Gly Tyr Arg Ala Gly Phe Ile
180 185 190 

CGTCGTTACGTC AACTGGCAGCCAAGTCCGTTAGAACACATCGAAATG 624 
Arg Arg Tyr Val Asn Trp Gln Pro Ser Pro Leu Glu His Ile Glu Met
195 200 205 

TTA GAG CAG CTT CGT GTT CTG TGG TAC GGC GAA AAA ATC CAT GTT GCT 672
Leu Glu Gln Leu Arg Val Leu Trp Tyr Gly Glu Lys Ile His Val Ala
210 215 220 

GTTGCTCAGGAAGTTCCTGGCACAGGTGTGGATACCCCTGAAGATCTC 720 
Val Ala Gln Glu Val Pro Gly Thr Gly Val Asp Thr Pro Glu Asp Leu
225 230 235 240 

GAC CCG TGG ACG AAT TOG ATG GAG ATC TAG GGT GCT TGC TAG TCT ATC 768 
Asp Pro Ser Thr Asn Ser Met Glu Ile Tyr Gly Ala Cys Tyr Ser Ile
245 250 255 

GAA CCG CTG GAC CTG CCG CCG ATC ATT GAG CGT CTG GAC GGT CTG TCT 81 6 
Glu Pro Leu Asp Leu Pro Pro Ile Ile Gln Arg Leu His Gly Leu Ser
260 265 270

GCT TTC TCT CTG CAC TCT TAC TGG CCG GGT GAA ATC AAC CGT GTT GCT 864 
Ala Phe Ser Leu His Ser Tyr Ser Pro Gly Glu Ile Asn Arg Val Ala
275 280 285 

GGT TGC CTG CGT AAA CTG GGT GTT CCG CCG GTG CGT GCT TGG CGT CAC 91 2 
Ala Cys Leu Arg Lys Leu Gly Val Pro Pro Leu Arg Ala Trp Arg His 
290 295 300 

CGT GCT CGTTCT GTT CGT GCT CGT CTG GTG GCT CGT GGT GGC CGT GCT 960 
Arg Ala Arg Ser Val Arg Ala Arg Leu Leu Ala Arg Gly Gly Arg Ala 
305 310 315 320 

GCT ATC TGC GGT AAA TAC CTG TTC AAC TGG GCT GTT CGT ACC AAA CTG 1008 
Ala Ile Cys Gly Lys Tyr Leu Phe Asn Trp Ala Val Arg Thr Lys Leu
325 330 335 

AAA CTG ACC CCG ATC GCT GCT GCT GGT CAG CTG GAC CTG TCT GGTTGG 1056 
Lys Leu Thr Pro Ile Ala Ala Ala Gly Gln Leu Asp Leu Ser Gly Trp
340 345 350 

TTC ACC GCT GGT TAG TCT GGT GGT GAC ATC TAG GAG TCT GTT TCT C AC 1 1 04 
Phe Thr Ala Gly Tyr Ser Gly Gly Asp Ile Tyr His Ser Val Ser His
355 360 365 

GCT CGT CCG CGTTGG ATC TGG TTC TGC CTG CTG CTG CTG GCT GCT GGT 1 152 
Ala Arg Pro Arg Trp Ile Trp Phe Cys Leu Leu Leu Leu Ala Ala Gly
370 375 380 

GTT GGT ATC TAC CTG CTG CCG AAC CGT 1179
Val Gly Ile Tyr Leu Leu Pro Asn Arg
385 390



(2) INFORMATION FOR SEQ ID NO:19: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 393 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:19:

Met Ser Phe Val Val Ile Ile Pro Ala Arg Tyr Ala Ser Thr Arg Leu
1 5 10 15

Pro Gly Lys Pro Leu Val Asp Ile Asn Gly Lys Pro Met Ile Val His
20 25 30

Val Leu Glu Arg Ala Arg Glu Ser Gly Ala Glu Arg Ile Ile Val Ala
35 40 45

Thr Asp His Glu Asp Val Ala Arg Ala Val Glu Ala Ala Gly Gly Glu
50 55 60

Val Cys Met Thr Arg Ala Asp His Gln Ser Gly Thr Glu Arg Leu Ala
65 70 75 80

Glu Val Val Glu Lys Cys Ala Phe Ser Asp Asp Thr Val Ile Val Asn
85 90 95

Val Gln Gly Asp Glu Pro Met Ile Pro Ala Thr Ile Ile Arg Gln Val
100 105 110

Ala Asp Asn Leu Ala Gln Arg Gln Val Gly Met Ala Thr Leu Ala Val
115 120 125

Pro Ile His Asn Ala Glu Glu Ala Phe Asn Pro Asn Ala Val Lys Val
130 135 140

Val Leu Asp Ala Glu Gly Tyr Ala Leu Tyr Phe Ser Arg Ala Thr Ile
145 150 155 160

Pro Trp Asp Arg Asp Arg Phe Ala Glu Gly Leu Glu Thr Val Gly Asp
165 170 175 

Asn Phe Leu Arg His Leu Gly Ile Tyr Gly Tyr Arg Ala Gly Phe Ile
180 185 190

Arg Arg Tyr Val Asn Trp Gln Pro Ser Pro Leu Glu His Ile Glu Met
195 200 205

Leu Glu Gln Leu Arg Val Leu Trp Tyr Gly Glu Lys Ile His Val Ala
210 215 220

Val Ala Gln Glu Val Pro Gly Thr Gly Val Asp Thr Pro Glu Asp Leu
225 230 235 240

Asp Pro Ser Thr Asn Ser Met Glu Ile Tyr Gly Ala Cys Tyr Ser Ile
245 250 255

Glu Pro Leu Asp Leu Pro Pro Ile Ile Gln Arg Leu His Gly Leu Ser
260 265 270

Ala Phe Ser Leu His Ser Tyr Ser Pro Gly Glu Ile Asn Arg Val Ala
275 280 285 

Ala Cys Leu Arg Lys Leu Gly Val Pro Pro Leu Arg Ala Trp Arg His 
290 295 300 

Arg Ala Arg Ser Val Arg Ala Arg Leu Leu Ala Arg Gly Gly Arg Ala 
305 310 315 320 

Ala Ile Cys Gly Lys Tyr Leu Phe Asn Trp Ala Val Arg Thr Lys Leu
325 330 335

Lys Leu Thr Pro Ile Ala Ala Ala Gly Gln Leu Asp Leu Ser Gly Trp
340 345 350

Phe Thr Ala Gly Tyr Ser Gly Gly Asp Ile Tyr His Ser Val Ser His
355 360 365

Ala Arg Pro Arg Trp Ile Trp Phe Cys Leu Leu Leu Leu Ala Ala Gly
370 375 380

Val Gly Ile Tyr Leu Leu Pro Asn Arg
385 390 



(2) INFORMATION FOR SEQ ID NO:20: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1791 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: circular 
(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE: 

(A) NAME/KEY: CDS 
(B) LOCATION: 1..1791

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:20:

ATGAGTTTTGTGGTCATTATTCCCGCGCGCTACGCGTCGACGCGTCTG 48
Met Ser Phe Val Val Ile Ile Pro Ala Arg Tyr Ala Ser Thr Arg Leu
1 5 10 15 

CCCGGTAAACCATTGGTTGATATTAACGGCAAACCCATGATTGTTCAT 96
Pro Gly Lys Pro Leu Val Asp Ile Asn Gly Lys Pro Met Ile Val His
20 25 30 

GTTCTTGAACGCGCGCGTGAATCAGGTGCCGAGCGCATCATCGTGGCA 144
Val Leu Glu Arg Ala Arg Glu Ser Gly Ala Glu Arg Ile Ile Val Ala
35 40 45 

ACCGATCATGAGGATGTTGCCCGCGCCGTTGAAGCCGCTGGCGGTGAA 192
Thr Asp His Glu Asp Val Ala Arg Ala Val Glu Ala Ala Gly Gly Glu
50 55 60 

GTATGTATGACGCGCGCCGATCATCAGTCAGGAACAGAACGTCTGGCG 240 
Val Cys Met Thr Arg Ala Asp His Gln Ser Gly Thr Glu Arg Leu Ala
65 70 75 80 

GAAGTTGTCGAAAAATGCGCATTCAGCGACGACACGGTGATCGTTAAT 288
Glu Val Val Glu Lys Cys Ala Phe Ser Asp Asp Thr Val Ile Val Asn
85 90 95 

GTG CAG GGT GAT GAA CCG ATG ATC CCT GCG ACA ATC ATT CGT CAG GTT 336 
Val Gln Gly Asp Glu Pro Met Ile Pro Ala Thr Ile Ile Arg Gln Val
100 105 110 

GCT GAT AAC CTC GCT CAG CGT CAG GTG GGT ATG GCG ACT CTG GCG GTG 384
Ala Asp Asn Leu Ala Gln Arg Gln Val Gly Met Ala Thr Leu Ala Val
115 120 125 

CCA ATC CAC AAT GCG GAA GAA GCG TTT AAC CCG AAT GCG GTG AAA GTG 432 
Pro Ile His Asn Ala Glu Glu Ala Phe Asn Pro Asn Ala Val Lys Val
130 135 140 

GTT CTC GAC GCT GAA GGG TAT GCA CTG TAC TTC TCT CGC GCC ACC ATT 480
Val Leu Asp Ala Glu Gly Tyr Ala Leu Tyr Phe Ser Arg Ala Thr Ile
145 150 155 160 

CCT TGG GAT CGT GAT CGT TTT GCA GAA GGC CTT GAA ACC GTT GGC GAT 528
Pro Trp Asp Arg Asp Arg Phe Ala Glu Gly Leu Glu Thr Val Gly Asp
165 170 175 

AAC TTC CTG CGT CAT CTT GGT ATT TAT GGC TAC CGT GCA GGC TTT ATC 576
Asn Phe Leu Arg His Leu Gly Ile Tyr Gly Tyr Arg Ala Gly Phe Ile
180 185 190

CGT CGT TAC GTC AAC TGG CAG CCA AGT CCG TTA GAA CAC ATC GAA ATG 624
Arg Arg Tyr Val Asn Trp Gln Pro Ser Pro Leu Glu His Ile Glu Met
195 200 205

TTA GAG CAG CTT CGT GTT CTG TGG TAC GGC GAA AAA ATC CAT GTT GCT 672
Leu Glu Gln Leu Arg Val Leu Trp Tyr Gly Glu Lys Ile His Val Ala
210 215 220 

GTT GCT CAG GAA GTT CCT GGC ACA GGT GTG GAT ACC CCT GAA GAT CTC 720
Val Ala Gln Glu Val Pro Gly Thr Gly Val Asp Thr Pro Glu Asp Leu
225 230 235 240 

GAG CCG TGG AGG AAT TCC ATG GAG GCT CAC TTC CTG TCT CAG GCG CCG 
Asp Pro Ser Thr Asn Ser Met Asp Ala His Phe Leu Ser Gln Ala Pro
245 250 255 

CCG GCG TCT TGG GAT CAG ATG TGG AAA TGG CTG ATC CGT CTG AAA CCG 
Pro Pro Ser Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys Pro
260 265 270 

ACC CTG CAC GGC CCG ACC CCG CTG CTG TAG CGT CTG GGT GCT GTTCAG 
Thr Leu His Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val Gln
275 280 285 

AAC GAA ATC ACC CTG ACC CAC CCG GTT ACC AAA TAG ATC ATG ACC TGG 
Asn Glu Ile Thr Leu Thr His Pro Val Thr Lys Tyr Ile Met Thr Cys
290 295 300 

ATG TCT GCT GAT CTA GAA GTT GTT ACC TCT ACC TGG GTT CTG GTT GGT 
Met Ser Ala Asp Leu Glu Val Val Thr Ser Thr Trp Val Leu Val Gly 
305 310 315 320 

GGT GTT CTG GCT GCT CTG GCT GCTTAC TGG CTG TGG ACC GGT TGG GTT 
Gly Val Leu Ala Ala Leu Ala Ala Tyr Cys Leu Ser Thr Gly Cys Val 
325 330 335 

GTT ATC GTT GGT CGT GTT GTT GTG TCT GGT AAA CCG GCG ATT ATC CCG 
Val Ile Val Gly Arg Val Val Leu Ser Gly Lys Pro Ala Ile Ile Pro
340 345 350 

GAC CGT GAA GTT CTG TAG CGT GAG TTC GAG GAA ATG GAA GAA TGG TCT 
Asp Arg Glu Val Leu Tyr Arg Glu Phe Asp Glu Met Glu Glu Cys Ser 
355 360 365 

CAG CAG CTG CCG TAG ATC GAA GAG GGT ATG ATG CTG GCT GAA CAG TTC 
Gln His Leu Pro Tyr Ile Glu Gln Gly Met Met Leu Ala Glu Gln Phe
370 375 380

AAA CAG AAA GCT CTG GGT CTG CTG CAG ACC GCTTGT CGT GAG GCT GAA 
Lys Gln Lys Ala Leu Gly Leu Leu Gln Thr Ala Ser Arg Gln Ala Glu
385 390 395 400 

GTT ATC GCT CCG GCT GTT CAG ACC AAC TGG CAG AAA CTG GAG ACC TTC 
Val Ile Ala Pro Ala Val Gln Thr Asn Trp Gln Lys Leu Glu Thr Phe
405 410 415 

TGG GCT AAA CAC ATG TGG AAC TTC ATC TCT GGT ATC CAG TAC CTG GCT 1296
Trp Ala Lys His Met Trp Asn Phe Ile Ser Gly Ile Gln Tyr Leu Ala
420 425 430 

GGTCTGTCTACCCTGCCGGGTAACCCGGCTATCGCAAGCTTGATGGCT 1344
Gly Leu Ser Thr Leu Pro Gly Asn Pro Ala Ile Ala Ser Leu Met Ala
435 440 445 

TTC ACC GCT GCT GTT ACC TCT CCG CTG ACC ACC TCT CAG ACC CTG CTG 1392
Phe Thr Ala Ala Val Thr Ser Pro Leu Thr Thr Ser Gln Thr Leu Leu
450 455 460 

TTCAACATTCTGGGTGGTTGGGTTGCTGCTCAGCTGGCTGCTCCGGGT 1440
Phe Asn Ile Leu Gly Gly Trp Val Ala Ala Gln Leu Ala Ala Pro Gly
465 470 475 480 

GCTGCTACCGCTTTCGTTGGTGCTGGTCTGGCTGGTGCTGCTATCGGT 1488
Ala Ala Thr Ala Phe Val Gly Ala Gly Leu Ala Gly Ala Ala Ile Gly
485 490 495 

TCT GTA GGC CTG GGT AAA GTT CTG ATC GAC ATT CTG GCT GGT TAC GGT 1536
Ser Val Gly Leu Gly Lys Val Leu Ile Asp Ile Leu Ala Gly Tyr Gly
500 505 510 

GCTGGTGTTGCTGGAGCTCTGGTTGCTTTCAAAATCATGTCTGGTGAA 1584
Ala Gly Val Ala Gly Ala Leu Val Ala Phe Lys Ile Met Ser Gly Glu
515 520 525 

GTT CCG TCT ACC GAA GAT CTG GTT AAC CTG CTG CCG GCT ATC CTG TCT 1632
Val Pro Ser Thr Glu Asp Leu Val Asn Leu Leu Pro Ala Ile Leu Ser
530 535 540 

CCG GGT GCT CTG GTT GTT GGT GTT GTT TGC GCT GCT ATC CTG CGT CGT 1680
Pro Gly Ala Leu Val Val Gly Val Val Cys Ala Ala Ile Leu Arg Arg
545 550 555 560 

CAC GTT GGC CCG GGT GAA GGT GCT GTT CAG TGG ATG AAC CGT CTG ATC 1728
His Val Gly Pro Gly Glu Gly Ala Val Gln Trp Met Asn Arg Leu Ile
565 570 575 

GCT TTC GCT TCT CGT GGT AAC CAC GTT TCT CCA TGG GAT CCT CTA GAC 1776 
Ala Phe Ala Ser Arg Gly Asn His Val Ser Pro Trp Asp Pro Leu Asp 
580 585 590 

TGC AGG CAT GCT AAG 1791
Cys Arg His Ala Lys
595



(2) INFORMATION FOR SEQ ID NO:21:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 597 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:21:

Met Ser Phe Val Val Ile Ile Pro Ala Arg Tyr Ala Ser Thr Arg Leu
1 5 10 15

Pro Gly Lys Pro Leu Val Asp Ile Asn Gly Lys Pro Met Ile Val His
20 25 30

Val Leu Glu Arg Ala Arg Glu Ser Gly Ala Glu Arg Ile Ile Val Ala
35 40 45

Thr Asp His Glu Asp Val Ala Arg Ala Val Glu Ala Ala Gly Gly Glu
50 55 60

Val Cys Met Thr Arg Ala Asp His Gln Ser Gly Thr Glu Arg Leu Ala
65 70 75 80

Glu Val Val Glu Lys Cys Ala Phe Ser Asp Asp Thr Val Ile Val Asn
85 90 95

Val Gln Gly Asp Glu Pro Met Ile Pro Ala Thr Ile Ile Arg Gln Val
100 105 110

Ala Asp Asn Leu Ala Gln Arg Gln Val Gly Met Ala Thr Leu Ala Val
115 120 125

Pro Ile His Asn Ala Glu Glu Ala Phe Asn Pro Asn Ala Val Lys Val
130 135 140

Val Leu Asp Ala Glu Gly Tyr Ala Leu Tyr Phe Ser Arg Ala Thr Ile
145 150 155 160 

Pro Trp Asp Arg Asp Arg Phe Ala Glu Gly Leu Glu Thr Val Gly Asp 
165 170 175 

Asn Phe Leu Arg His Leu Gly Ile Tyr Gly Tyr Arg Ala Gly Phe Ile
180 185 190

Arg Arg Tyr Val Asn Trp Gln Pro Ser Pro Leu Glu His Ile Glu Met
195 200 205

Leu Glu Gln Leu Arg Val Leu Trp Tyr Gly Glu Lys Ile His Val Ala
210 215 220

Val Ala Gln Glu Val Pro Gly Thr Gly Val Asp Thr Pro Glu Asp Leu
225 230 235 240

Asp Pro Ser Thr Asn Ser Met Asp Ala His Phe Leu Ser Gln Ala Pro
245 250 255

Pro Pro Ser Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys Pro
260 265 270

Thr Leu His Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val Gln
275 280 285

Asn Glu Ile Thr Leu Thr His Pro Val Thr Lys Tyr Ile Met Thr Cys
290 295 300 

Met Ser Ala Asp Leu Glu Val Val Thr Ser Thr Trp Val Leu Val Gly 
305 310 315 320 

Gly Val Leu Ala Ala Leu Ala Ala Tyr Cys Leu Ser Thr Gly Cys Val 
325 330 335 

Val Ile Val Gly Arg Val Val Leu Ser Gly Lys Pro Ala Ile Ile Pro
340 345 350

Asp Arg Glu Val Leu Tyr Arg Glu Phe Asp Glu Met Glu Glu Cys Ser
355 360 365

Gln His Leu Pro Tyr Ile Glu Gln Gly Met Met Leu Ala Glu Gln Phe
370 375 380

Lys Gln Lys Ala Leu Gly Leu Leu Gln Thr Ala Ser Arg Gln Ala Glu
385 390 395 400

Val Ile Ala Pro Ala Val Gln Thr Asn Trp Gln Lys Leu Glu Thr Phe
405 410 415

Trp Ala Lys His Met Trp Asn Phe Ile Ser Gly Ile Gln Tyr Leu Ala
420 425 430

Gly Leu Ser Thr Leu Pro Gly Asn Pro Ala Ile Ala Ser Leu Met Ala
435 440 445 

Phe Thr Ala Ala Val Thr Ser Pro Leu Thr Thr Ser Gln Thr Leu Leu
450 455 460

Phe Asn Ile Leu Gly Gly Trp Val Ala Ala Gln Leu Ala Ala Pro Gly
465 470 475 480

Ala Ala Thr Ala Phe Val Gly Ala Gly Leu Ala Gly Ala Ala Ile Gly
485 490 495

Ser Val Gly Leu Gly Lys Val Leu Ile Asp Ile Leu Ala Gly Tyr Gly
500 505 510

Ala Gly Val Ala Gly Ala Leu Val Ala Phe Lys Ile Met Ser Gly Glu
515 520 525

Val Pro Ser Thr Glu Asp Leu Val Asn Leu Leu Pro Ala Ile Leu Ser
530 535 540

Pro Gly Ala Leu Val Val Gly Val Val Cys Ala Ala Ile Leu Arg Arg
545 550 555 560

His Val Gly Pro Gly Glu Gly Ala Val Gln Trp Met Asn Arg Leu Ile
565 570 575 

Ala Phe Ala Ser Arg Gly Asn His Val Ser Pro Trp Asp Pro Leu Asp 
580 585 590 

Cys Arg His Ala Lys 
595 



(2) INFORMATION FOR SEQ ID NO:22:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1797 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: circular 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 

(A) NAME/KEY: CDS 
(B) LOCATION: 1..1797

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:22:

ATGAGTTTTGTGGTCATTATTCCCGCGCGCTACGCGTCGACGCGTCTG 48 
Met Ser Phe Val Val Ile Ile Pro Ala Arg Tyr Ala Ser Thr Arg Leu
1 5 10 15 

CCCGGTAAACCATTGGTTGATATTAACGGCAAACCCATGATTGTTCAT 96 
Pro Gly Lys Pro Leu Val Asp Ile Asn Gly Lys Pro Met Ile Val His
20 25 30 

GTTCTTGAACGCGCGCGTGAATCAGGTGCCGAGCGCATCATCGTGGCA 144
Val Leu Glu Arg Ala Arg Glu Ser Gly Ala Glu Arg Ile Ile Val Ala
35 40 45 

ACCGATCATGAGGATGTTGCCCGCGCCGTTGAAGCGGCTGGCGGTGAA 192 
Thr Asp His Glu Asp Val Ala Arg Ala Val Glu Ala Ala Gly Gly Glu 
50 55 60 

GTA TGT ATG ACG CGC GCC GAT CAT CAG TCA GGA ACA GAA CGT CTG GCG 240
Val Cys Met Thr Arg Ala Asp His Gln Ser Gly Thr Glu Arg Leu Ala
65 70 75 80 

GAA GTT GTC GAA AAA TGC GCA TTC AGC GAC GAC ACG GTG ATC GTT AAT 288 
Glu Val Val Glu Lys Cys Ala Phe Ser Asp Asp Thr Val Ile Val Asn
85 90 95 

GTG CAG GGT GAT GAA CCG ATG ATC CCT GCG ACA ATC ATT CGT CAG GTT 336 
Val Gln Gly Asp Glu Pro Met Ile Pro Ala Thr Ile Ile Arg Gln Val
100 105 110

GCTGATAACCTCGCTCAGCGTCAGGTGGGTATGGCGACTCTGGCGGTG 384
Ala Asp Asn Leu Ala Gln Arg Gln Val Gly Met Ala Thr Leu Ala Val
115 120 125 

CCAATCCACAATGCGGAAGAAGCGTTTAACCCGAATGCGGTGAAAGTG 432
Pro Ile His Asn Ala Glu Glu Ala Phe Asn Pro Asn Ala Val Lys Val
130 135 140 

GTT CTC GAC GCT GAA GGG TAT GCA CTG TAC TTC TCT CGC GCC ACC ATT 480 
Val Leu Asp Ala Glu Gly Tyr Ala Leu Tyr Phe Ser Arg Ala Thr Ile
145 150 155 160 

CCTTGGGATCGTGATCGTTTTGCAGAAGGCCTTGAAACCGTTGGCGAT 528
Pro Trp Asp Arg Asp Arg Phe Ala Glu Gly Leu Glu Thr Val Gly Asp
165 170 175

AAC TTC CTG CGT CAT CTT GGT ATT TAT GGC TAC CGT GCA GGC TTT ATC 576
Asn Phe Leu Arg His Leu Gly Ile Tyr Gly Tyr Arg Ala Gly Phe Ile
180 185 190

CGT CGT TAC GTC AAC TGG CAG CCA AGT CCG TTA GAA CAC ATC GAA ATG 624 
Arg Arg Tyr Val Asn Trp Gln Pro Ser Pro Leu Glu His Ile Glu Met
195 200 205 

TTA GAG CAG CTT CGT GTT CTG TGG TAC GGC GAA AAA ATC CAT GTT GCT 672 
Leu Glu Gln Leu Arg Val Leu Trp Tyr Gly Glu Lys Ile His Val Ala
210 215 220 

GTTGCTCAGGAAGTTCCTGGCACAGGTGTGGATACCCCTGAAGATCTC 720
Val Ala Gln Glu Val Pro Gly Thr Gly Val Asp Thr Pro Glu Asp Leu
225 230 235 240 

GAC CCG TCG ACG AAT TCC ATG GAC GCT CAC TTC CTG TCT CAG ACC AAA 768
Asp Pro Ser Thr Asn Ser Met Asp Ala His Phe Leu Ser Gln Thr Lys
245 250 255 

CAG TCT GGT GAA AAC CTT CCG TAC CTG GTT GCT TAC CAG GCT ACC GTT 816
Gln Ser Gly Glu Asn Leu Pro Tyr Leu Val Ala Tyr Gln Ala Thr Val
260 265 270 

TGC GCT CGT GCT CAG GCC CCG ACC CCG CTG CTG TAC CGT CTG GGT GCT 864 
Cys Ala Arg Ala Gln Ala Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala
275 280 285 

GTT CAG AAC GAA ATC ACC CTG ACC CAC CCG GTT ACC AAA TAC ATC ATG 912 
Val Gln Asn Glu Ile Thr Leu Thr His Pro Val Thr Lys Tyr Ile Met
290 295 300 

ACC TGC ATG TCT GCT GAT CTA GAA GTT GTT ACC TCT ACC TGG GTT CTG 960 
Thr Cys Met Ser Ala Asp Leu Glu Val Val Thr Ser Thr Trp Val Leu
305 310 315 320 

GTT GGT GGT GTT CTG GCT GCT CTG GCT GCT TAC TGC CTG TCG ACC GGT 1008
Val Gly Gly Val Leu Ala Ala Leu Ala Ala Tyr Cys Leu Ser Thr Gly
325 330 335 

TGCGTTGTTATCGTTGGTCGTGTTGTTCTGTCTGGTAAACCGGCCATT 1056 
Cys Val Val Ile Val Gly Arg Val Val Leu Ser Gly Lys Pro Ala Ile
340 345 350 

ATCCCGGACCGTGAAGTTCTGTACCGTGAGTTCGACGAAATGGAAGAA 1104
Ile Pro Asp Arg Glu Val Leu Tyr Arg Glu Phe Asp Glu Met Glu Glu
355 360 365 

TGC TCT CAG CAC CTG CCG TAC ATC GAA CAG GGT ATG ATG CTG GCT GAA 1152
Cys Ser Gln His Leu Pro Tyr Ile Glu Gln Gly Met Met Leu Ala Glu
370 375 380 

CAG TTC AAA CAG AAA GCT CTG GGT CTG CTG CAG ACC GCT TCT CGT CAG 1200
Gln Phe Lys Gln Lys Ala Leu Gly Leu Leu Gln Thr Ala Ser Arg Gln
385 390 395 400 

GCTGAAGTTATCGCTCCGGCTGTTCAGACCAACTGGCAGAAACTCGAG 1248 
Ala Glu Val Ile Ala Pro Ala Val Gln Thr Asn Trp Gln Lys Leu Glu
405 410 415 

ACC TTC TGG GCT AAA CAC ATG TGG AAC TTC ATC TCT GGT ATC CAG TAC 1296
Thr Phe Trp Ala Lys His Met Trp Asn Phe Ile Ser Gly Ile Gln Tyr
420 425 430 

CTGGCTGGTCTGTCTACCCTGCCGGGTAACCCGGCTATCGCAAGCTTG 1344
Leu Ala Gly Leu Ser Thr Leu Pro Gly Asn Pro Ala Ile Ala Ser Leu
435 440 445

ATG GCT TTC ACC GCT GCT GTT ACC TCT CCG CTG ACC ACC TCT CAG ACC 1392
Met Ala Phe Thr Ala Ala Val Thr Ser Pro Leu Thr Thr Ser Gln Thr
450 455 460 

CTG CTG TTC AAC ATT CTG GGT GGT TGG GTT GCT GCT CAG CTG GCT GCT 1440 
Leu Leu Phe Asn Ile Leu Gly Gly Trp Val Ala Ala Gln Leu Ala Ala
465 470 475 480 

CCGGGTGCTGCTACCGCTTTCGTTGGTGCTGGTCTGGCTGGTGCTGCT 1488 
Pro Gly Ala Ala Thr Ala Phe Val Gly Ala Gly Leu Ala Gly Ala Ala
485 490 495 

ATC GGT TCT GTA GGC CTG GGT AAA GTT CTG ATC GAC ATT CTG GCT GGT 1536 
Ile Gly Ser Val Gly Leu Gly Lys Val Leu Ile Asp Ile Leu Ala Gly
500 505 510 

TAC GGT GCT GGT GTT GCT GGA GCT CTG GTT GCT TTC AAA ATC ATG TCT 1584
Tyr Gly Ala Gly Val Ala Gly Ala Leu Val Ala Phe Lys Ile Met Ser
515 520 525 

GGT GAA GTT CCG TCT ACC GAA GAT CTG GTT AAC CTG CTG CCG GCT ATC 1632
Gly Glu Val Pro Ser Thr Glu Asp Leu Val Asn Leu Leu Pro Ala Ile
530 535 540 
CTG TCT CCG GGT GCT CTG GTT GTT GGT GTT GTT TGC GCT GCT ATC CTG 1680
Leu Ser Pro Gly Ala Leu Val Val Gly Val Val Cys Ala Ala Ile Leu
545 550 555 560 

CGTCGTCACGTTGGCCCGGGTGAAGGTGCTGTTCAGTGGATGAACCGT 1728
Arg Arg His Val Gly Pro Gly Glu Gly Ala Val Gln Trp Met Asn Arg
565 570 575 

CTG ATC GCT TTC GCT TCT CGT GGT AAC CAC GTT TCT CCA TGG GAT CCT 1776 
Leu Ile Ala Phe Ala Ser Arg Gly Asn His Val Ser Pro Trp Asp Pro
580 585 590 

CTA GAC TGC AGG CAT GCT AAG 1797
Leu Asp Cys Arg His Ala Lys
595



(2) INFORMATION FOR SEQ ID NO:23: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 599 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:23: 

Met Ser Phe Val Val Ile Ile Pro Ala Arg Tyr Ala Ser Thr Arg Leu
1 5 10 15

Pro Gly Lys Pro Leu Val Asp Ile Asn Gly Lys Pro Met Ile Val His
20 25 30

Val Leu Glu Arg Ala Arg Glu Ser Gly Ala Glu Arg Ile Ile Val Ala
35 40 45

Thr Asp His Glu Asp Val Ala Arg Ala Val Glu Ala Ala Gly Gly Glu
50 55 60

Val Cys Met Thr Arg Ala Asp His Gln Ser Gly Thr Glu Arg Leu Ala
65 70 75 80

Glu Val Val Glu Lys Cys Ala Phe Ser Asp Asp Thr Val Ile Val Asn
85 90 95

Val Gln Gly Asp Glu Pro Met Ile Pro Ala Thr Ile Ile Arg Gln Val
100 105 110

Ala Asp Asn Leu Ala Gln Arg Gln Val Gly Met Ala Thr Leu Ala Val
115 120 125

Pro Ile His Asn Ala Glu Glu Ala Phe Asn Pro Asn Ala Val Lys Val
130 135 140

Val Leu Asp Ala Glu Gly Tyr Ala Leu Tyr Phe Ser Arg Ala Thr Ile
145 150 155 160 

Pro Trp Asp Arg Asp Arg Phe Ala Glu Gly Leu Glu Thr Val Gly Asp 
165 170 175 

Asn Phe Leu Arg His Leu Gly lie Tyr Gly Tyr Arg Ala Gly Phe lie 
180 185 190 

Arg Arg Tyr Val Asn Trp Gin Pro Ser Pro Leu Glu His lie Glu Met 
195 200 205 

Leu Glu Gin Leu Arg Val Leu Trp Tyr Gly Glu Lys lie His Val Ala 
210 215 220 

Val Ala Gin Glu Val Pro Gly Thr Gly Val Asp Thr Pro Glu Asp Leu 
225 230 235 240 

Asp Pro Ser Thr Asn Ser Met Asp Ala His Phe Leu Ser Gln Thr Lys 
245 250 255 

Gln Ser Gly Glu Asn Leu Pro Tyr Leu Val Ala Tyr Gln Ala Thr Val 
260 265 270 

Cys Ala Arg Ala Gln Ala Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala 
275 280 285 

Val Gln Asn Glu Ile Thr Leu Thr His Pro Val Thr Lys Tyr Ile Met 
290 295 300 

Thr Cys Met Ser Ala Asp Leu Glu Val Val Thr Ser Thr Trp Val Leu 
305 310 315 320 

Val Gly Gly Val Leu Ala Ala Leu Ala Ala Tyr Cys Leu Ser Thr Gly 
325 330 335 

Cys Val Val Ile Val Gly Arg Val Val Leu Ser Gly Lys Pro Ala Ile 
340 345 350 

Ile Pro Asp Arg Glu Val Leu Tyr Arg Glu Phe Asp Glu Met Glu Glu 
355 360 365 

Cys Ser Gln His Leu Pro Tyr Ile Glu Gln Gly Met Met Leu Ala Glu 
370 375 380 

Gln Phe Lys Gln Lys Ala Leu Gly Leu Leu Gln Thr Ala Ser Arg Gln 
385 390 395 400 

Ala Glu Val Ile Ala Pro Ala Val Gln Thr Asn Trp Gln Lys Leu Glu 
405 410 415 

Thr Phe Trp Ala Lys His Met Trp Asn Phe Ile Ser Gly Ile Gln Tyr 
420 425 430 

Leu Ala Gly Leu Ser Thr Leu Pro Gly Asn Pro Ala Ile Ala Ser Leu 
435 440 445 

Met Ala Phe Thr Ala Ala Val Thr Ser Pro Leu Thr Thr Ser Gln Thr 
450 455 460 

Leu Leu Phe Asn Ile Leu Gly Gly Trp Val Ala Ala Gln Leu Ala Ala 
465 470 475 480 

Pro Gly Ala Ala Thr Ala Phe Val Gly Ala Gly Leu Ala Gly Ala Ala 
485 490 495 

Ile Gly Ser Val Gly Leu Gly Lys Val Leu Ile Asp Ile Leu Ala Gly 
500 505 510 

Tyr Gly Ala Gly Val Ala Gly Ala Leu Val Ala Phe Lys Ile Met Ser 
515 520 525 

Gly Glu Val Pro Ser Thr Glu Asp Leu Val Asn Leu Leu Pro Ala Ile 
530 535 540 

Leu Ser Pro Gly Ala Leu Val Val Gly Val Val Cys Ala Ala Ile Leu 
545 550 555 560 

Arg Arg His Val Gly Pro Gly Glu Gly Ala Val Gln Trp Met Asn Arg 
565 570 575 

Leu Ile Ala Phe Ala Ser Arg Gly Asn His Val Ser Pro Trp Asp Pro 
580 585 590 

Leu Asp Cys Arg His Ala Lys 
595 



(2) INFORMATION FOR SEQ ID NO:24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1251 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: circular 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 

(A) NAME/KEY: CDS 
(B) LOCATION: 1..1251 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:24: 

ATG AGT TTT GTG GTC ATT ATT CCC GCG CGC TAC GCG TCG ACG CGT CTG 48 
Met Ser Phe Val Val Ile Ile Pro Ala Arg Tyr Ala Ser Thr Arg Leu 
1 5 10 15 

CCC GGT AAA CCA TTG GTT GAT ATT AAC GGC AAA CCC ATG ATT GTT CAT 96 
Pro Gly Lys Pro Leu Val Asp Ile Asn Gly Lys Pro Met Ile Val His 
20 25 30 

GTT CTT GAA CGC GCG CGT GAA TCA GGT GCC GAG CGC ATC ATC GTG GCA 144 
Val Leu Glu Arg Ala Arg Glu Ser Gly Ala Glu Arg Ile Ile Val Ala 
35 40 45 

ACC GAT CAT GAG GAT GTT GCC CGC GCC GTT GAA GCC GCT GGC GGT GAA 192 
Thr Asp His Glu Asp Val Ala Arg Ala Val Glu Ala Ala Gly Gly Glu 
50 55 60 

GTA TGT ATG ACG CGC GCC GAT CAT CAG TCA GGA ACA GAA CGT CTG GCG 240 
Val Cys Met Thr Arg Ala Asp His Gln Ser Gly Thr Glu Arg Leu Ala 
65 70 75 80 

GAA GTT GTC GAA AAA TGC GCA TTC AGC GAC GAC ACG GTG ATC GTT AAT 288 
Glu Val Val Glu Lys Cys Ala Phe Ser Asp Asp Thr Val Ile Val Asn 
85 90 95 

GTG CAG GGT GAT GAA CCG ATG ATC CCT GCG ACA ATC ATT CGT CAG GTT 336 
Val Gln Gly Asp Glu Pro Met Ile Pro Ala Thr Ile Ile Arg Gln Val 
100 105 110 

GCT GAT AAC CTC GCT CAG CGT CAG GTG GGT ATG GCG ACT CTG GCG GTG 384 
Ala Asp Asn Leu Ala Gln Arg Gln Val Gly Met Ala Thr Leu Ala Val 
115 120 125 

CCA ATC CAC AAT GCG GAA GAA GCG TTT AAC CCG AAT GCG GTG AAA GTG 432 
Pro Ile His Asn Ala Glu Glu Ala Phe Asn Pro Asn Ala Val Lys Val 
130 135 140 

GTT CTC GAC GCT GAA GGG TAT GCA CTG TAC TTC TCT CGC GCC ACC ATT 480 
Val Leu Asp Ala Glu Gly Tyr Ala Leu Tyr Phe Ser Arg Ala Thr Ile 
145 150 155 160 

CCT TGG GAT CGT GAT CGT TTT GCA GAA GGC CTT GAA ACC GTT GGC GAT 528 
Pro Trp Asp Arg Asp Arg Phe Ala Glu Gly Leu Glu Thr Val Gly Asp 
165 170 175 

AAC TTC CTG CGT CAT CTT GGT ATT TAT GGC TAC CGT GCA GGC TTT ATC 576 
Asn Phe Leu Arg His Leu Gly Ile Tyr Gly Tyr Arg Ala Gly Phe Ile 
180 185 190 

CGT CGT TAC GTC AAC TGG CAG CCA AGT CCG TTA GAA CAC ATC GAA ATG 624 
Arg Arg Tyr Val Asn Trp Gln Pro Ser Pro Leu Glu His Ile Glu Met 
195 200 205 

TTA GAG CAG CTT CGT GTT CTG TGG TAC GGC GAA AAA ATC CAT GTT GCT 672 
Leu Glu Gln Leu Arg Val Leu Trp Tyr Gly Glu Lys Ile His Val Ala 
210 215 220 

GTT GCT CAG GAA GTT CCT GGC ACA GGT GTG GAT ACC CCT GAA GAT CTC 720 
Val Ala Gln Glu Val Pro Gly Thr Gly Val Asp Thr Pro Glu Asp Leu 
225 230 235 240 

GAC CCG TCG ACT CGA ATT CGA GCT CGG TAC CCT GAG ACA ATC ACG CTT 768 
Asp Pro Ser Thr Arg Ile Arg Ala Arg Tyr Pro Glu Thr Ile Thr Leu 
245 250 255 

CXJCCAGGATGCTGTCTCCCGCACCCAGCGTC^GGCAGGACT 816 
Pro Gln Asp Ala Val Ser Arg Thr Gln Arg Arg Gly Arg Thr Gly Arg 
260 265 270 

GGG AAG CCA GGC ATC TAC AGA TTT GTG GCA CCG GGG GAG CGC CCT TCC 864 
Gly Lys Pro Gly Ile Tyr Arg Phe Val Ala Pro Gly Glu Arg Pro Ser 
275 280 285 

GGC ATG TTC G^CTCG TCX:^ GTC CTCTGC GAG TGC TAT GAC GCG GGC TGG 912 
Gly Met Phe Asp Ser Ser Val Leu Cys Glu Cys Tyr Asp Ala Gly Trp 
290 295 300 

(XTTGGTATGAGCTCACACXXGCX:G^GA^3CACAGTTAGGCTACGAGC^ 960 
Pro Trp Tyr Glu Leu Thr Pro Ala Glu Thr Thr Val Arg Leu Arg Ala 
305 310 315 320 

TAC ATG AAC ACC CCG GGA CTC CCC GTG TGC CAA GAC CAT CTT GAA TTT 1008 
Tyr Met Asn Thr Pro Gly Leu Pro Val Cys Gln Asp His Leu Glu Phe 
325 330 335 

TGG GAG GGC GTC TTC ACG GGT CTC ACC CAT ATA GAC GCC CAC TTT CTA 1056 
Trp Glu Gly Val Phe Thr Gly Leu Thr His Ile Asp Ala His Phe Leu 
340 345 350 

TCC CAG ACA AAG CAG AGT GGG GAA AAC CTT CCT TAC CTG GTA GCG TAC 1104 
Ser Gln Thr Lys Gln Ser Gly Glu Asn Leu Pro Tyr Leu Val Ala Tyr 
355 360 365 

CAA GCC ACC GTG TGC GCT AGA GCT CAA GCC CCT CCC CCA TCG TGG GAC 1152 
Gln Ala Thr Val Cys Ala Arg Ala Gln Ala Pro Pro Pro Ser Trp Asp 
370 375 380 

CAG ATG TGG AAG TGC TTG ATC CGC CTC AAG CCT ACC CTT CAT GGG CCG 1200 
Gln Met Trp Lys Cys Leu Ile Arg Leu Lys Pro Thr Leu His Gly Pro 
385 390 395 400 

ACC CCC CTG CTA TAC AGA CTG GGC GGG GGA TCC TCT AGA CTG CAG GCA 1248 
Thr Pro Leu Leu Tyr Arg Leu Gly Gly Gly Ser Ser Arg Leu Gln Ala 
405 410 415 

TGC 1251 
Cys 
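
As an illustrative cross-check that is not part of the original listing, the CDS spans stated for the nucleotide sequences can be compared with the lengths stated for the corresponding amino acid sequences. The sketch below assumes only the LENGTH and LOCATION values quoted in this listing (each CDS starts at position 1 and includes no stop codon); the table name `LISTING` and the function name `codons_in_cds` are hypothetical helpers introduced for this example.

```python
"""Cross-check CDS lengths against protein lengths quoted in the listing."""

# nucleotide SEQ ID NO -> (CDS end position, protein SEQ ID NO, stated residues)
# Values are copied from the LENGTH / LOCATION fields of the sequence listing.
LISTING = {
    24: (1251, 25, 417),
    26: (1275, 27, 425),
    28: (1401, 29, 467),
    30: (1422, 31, 474),
}


def codons_in_cds(start: int, end: int) -> int:
    """Return the number of complete codons in a CDS spanning start..end
    (1-based, inclusive). Raises ValueError if the span is not in frame."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError(f"CDS length {length} is not a multiple of 3")
    return length // 3


if __name__ == "__main__":
    for nt_id, (cds_end, aa_id, stated_aa) in LISTING.items():
        n = codons_in_cds(1, cds_end)
        status = "consistent" if n == stated_aa else "MISMATCH"
        print(f"SEQ ID NO:{nt_id} -> SEQ ID NO:{aa_id}: "
              f"{n} codons vs {stated_aa} residues ({status})")
```

Under these assumptions each coding sequence translates to exactly the stated number of residues, for example 1251 / 3 = 417 for SEQ ID NO:24 and SEQ ID NO:25.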



(2) INFORMATION FOR SEQ ID NO:25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 417 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:25: 

Met Ser Phe Val Val Ile Ile Pro Ala Arg Tyr Ala Ser Thr Arg Leu 
1 5 10 15 

Pro Gly Lys Pro Leu Val Asp Ile Asn Gly Lys Pro Met Ile Val His 
20 25 30 

Val Leu Glu Arg Ala Arg Glu Ser Gly Ala Glu Arg Ile Ile Val Ala 
35 40 45 

Thr Asp His Glu Asp Val Ala Arg Ala Val Glu Ala Ala Gly Gly Glu 
50 55 60 

Val Cys Met Thr Arg Ala Asp His Gln Ser Gly Thr Glu Arg Leu Ala 
65 70 75 80 

Glu Val Val Glu Lys Cys Ala Phe Ser Asp Asp Thr Val Ile Val Asn 
85 90 95 

Val Gln Gly Asp Glu Pro Met Ile Pro Ala Thr Ile Ile Arg Gln Val 
100 105 110 

Ala Asp Asn Leu Ala Gln Arg Gln Val Gly Met Ala Thr Leu Ala Val 
115 120 125 

Pro Ile His Asn Ala Glu Glu Ala Phe Asn Pro Asn Ala Val Lys Val 
130 135 140 

Val Leu Asp Ala Glu Gly Tyr Ala Leu Tyr Phe Ser Arg Ala Thr Ile 
145 150 155 160 

Pro Trp Asp Arg Asp Arg Phe Ala Glu Gly Leu Glu Thr Val Gly Asp 
165 170 175 

Asn Phe Leu Arg His Leu Gly Ile Tyr Gly Tyr Arg Ala Gly Phe Ile 
180 185 190 

Arg Arg Tyr Val Asn Trp Gln Pro Ser Pro Leu Glu His Ile Glu Met 
195 200 205 

Leu Glu Gln Leu Arg Val Leu Trp Tyr Gly Glu Lys Ile His Val Ala 
210 215 220 

Val Ala Gln Glu Val Pro Gly Thr Gly Val Asp Thr Pro Glu Asp Leu 
225 230 235 240 



Asp Pro Ser Thr Arg Ile Arg Ala Arg Tyr Pro Glu Thr Ile Thr Leu 
245 250 255 

Pro Gln Asp Ala Val Ser Arg Thr Gln Arg Arg Gly Arg Thr Gly Arg 
260 265 270 






Gly Lys Pro Gly Ile Tyr Arg Phe Val Ala Pro Gly Glu Arg Pro Ser 
275 280 285 

Gly Met Phe Asp Ser Ser Val Leu Cys Glu Cys Tyr Asp Ala Gly Trp 
290 295 300 

Pro Trp Tyr Glu Leu Thr Pro Ala Glu Thr Thr Val Arg Leu Arg Ala 
305 310 315 320 

Tyr Met Asn Thr Pro Gly Leu Pro Val Cys Gln Asp His Leu Glu Phe 
325 330 335 

Trp Glu Gly Val Phe Thr Gly Leu Thr His Ile Asp Ala His Phe Leu 
340 345 350 

Ser Gln Thr Lys Gln Ser Gly Glu Asn Leu Pro Tyr Leu Val Ala Tyr 
355 360 365 

Gln Ala Thr Val Cys Ala Arg Ala Gln Ala Pro Pro Pro Ser Trp Asp 
370 375 380 

Gln Met Trp Lys Cys Leu Ile Arg Leu Lys Pro Thr Leu His Gly Pro 
385 390 395 400 

Thr Pro Leu Leu Tyr Arg Leu Gly Gly Gly Ser Ser Arg Leu Gln Ala 
405 410 415 

Cys 



(2) INFORMATION FOR SEQ ID NO:26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1275 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: circular 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 

(A) NAME/KEY: CDS 
(B) LOCATION: 1..1275 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:26: 

ATG AGT TTT GTG GTC ATT ATT CCC GCG CGC TAC GCG TCG ACG CGT CTG 48 
Met Ser Phe Val Val Ile Ile Pro Ala Arg Tyr Ala Ser Thr Arg Leu 
1 5 10 15 

CCC GGT AAA CCA TTG GTT GAT ATT AAC GGC AAA CCC ATG ATT GTT CAT 96 
Pro Gly Lys Pro Leu Val Asp Ile Asn Gly Lys Pro Met Ile Val His 
20 25 30 

GTT CTT GAA CGC GCG CGT GAA TCA GGT GCC GAG CGC ATC ATC GTG GCA 144 
Val Leu Glu Arg Ala Arg Glu Ser Gly Ala Glu Arg Ile Ile Val Ala 
35 40 45 

ACC GAT CAT GAG GAT GTT GCC CGC GCC GTT GAA GCC GCT GGC GGT GAA 192 
Thr Asp His Glu Asp Val Ala Arg Ala Val Glu Ala Ala Gly Gly Glu 
50 55 60 

GTA TGT ATG ACG CGC GCC GAT CAT CAG TCA GGA ACA GAA CGT CTG GCG 240 
Val Cys Met Thr Arg Ala Asp His Gln Ser Gly Thr Glu Arg Leu Ala 
65 70 75 80 

GAA GTT GTC GAA AAA TGC GCA TTC AGC GAC GAC ACG GTG ATC GTT AAT 288 
Glu Val Val Glu Lys Cys Ala Phe Ser Asp Asp Thr Val Ile Val Asn 
85 90 95 

GTG CAG GGT GAT GAA CCG ATG ATC CCT GCG ACA ATC ATT CGT CAG GTT 336 
Val Gln Gly Asp Glu Pro Met Ile Pro Ala Thr Ile Ile Arg Gln Val 
100 105 110 

GCT GAT AAC CTC GCT CAG CGT CAG GTG GGT ATG GCG ACT CTG GCG GTG 384 
Ala Asp Asn Leu Ala Gln Arg Gln Val Gly Met Ala Thr Leu Ala Val 
115 120 125 

CCA ATC CAC AAT GCG GAA GAA GCG TTT AAC CCG AAT GCG GTG AAA GTG 432 
Pro Ile His Asn Ala Glu Glu Ala Phe Asn Pro Asn Ala Val Lys Val 
130 135 140 

GTT CTC GAC GCT GAA GGG TAT GCA CTG TAC TTC TCT CGC GCC ACC ATT 480 
Val Leu Asp Ala Glu Gly Tyr Ala Leu Tyr Phe Ser Arg Ala Thr Ile 
145 150 155 160 

CCT TGG GAT CGT GAT CGT TTT GCA GAA GGC CTT GAA ACC GTT GGC GAT 528 
Pro Trp Asp Arg Asp Arg Phe Ala Glu Gly Leu Glu Thr Val Gly Asp 
165 170 175 

AAC TTC CTG CGT CAT CTT GGT ATT TAT GGC TAC CGT GCA GGC TTT ATC 576 
Asn Phe Leu Arg His Leu Gly Ile Tyr Gly Tyr Arg Ala Gly Phe Ile 
180 185 190 

CGT CGT TAC GTC AAC TGG CAG CCA AGT CCG TTA GAA CAC ATC GAA ATG 624 
Arg Arg Tyr Val Asn Trp Gln Pro Ser Pro Leu Glu His Ile Glu Met 
195 200 205 

TTA GAG CAG CTT CGT GTT CTG TGG TAC GGC GAA AAA ATC CAT GTT GCT 672 
Leu Glu Gln Leu Arg Val Leu Trp Tyr Gly Glu Lys Ile His Val Ala 
210 215 220 

GTT GCT CAG GAA GTT CCT GGC ACA GGT GTG GAT ACC CCT GAA GAT CTC 720 
Val Ala Gln Glu Val Pro Gly Thr Gly Val Asp Thr Pro Glu Asp Leu 
225 230 235 240 

GAC CCG TCG ACT CGA ATT CGT AGG TCG CGC AAT TTG GGT AAG GTC ATC 768 
Asp Pro Ser Thr Arg Ile Arg Arg Ser Arg Asn Leu Gly Lys Val Ile 
245 250 255 

G^CACX:CTCACX3TGCGGCTTCGCCGACCTCATGGGGTATATTGCGCTC 816 
Asp Thr Leu Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu 
260 265 270 

GTCGQCQCX:CCTCT^GG^GGCG3TGCCAGGQCCCTGGGC^^ 864 
Val Gly Ala Pro Leu Gly Gly Ala Ala Arg Ala Leu Gly His Gly Val 
275 280 285 

CGG GTT CTG GAA GAC GGC GTG AAC TAT GCG ACA GGG AAT CTT CCT GGT 912 
Arg Val Leu Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn Leu Pro Gly 
290 295 300 

TGC TCT TTC TCT ATC TTC CTT CTG GCC CTG CTC TCT TGC CTG ACC GTG 960 
Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Val 
305 310 315 320 

CCC GCA TCA GCC TAC CAA GTA CGC AAC TCC TCG GGC CTT TAC CAT GTC 1008 
Pro Ala Ser Ala Tyr Gln Val Arg Asn Ser Ser Gly Leu Tyr His Val 
325 330 335 

ACC AAT GAT TGC CCC AAC TCG AGT ATT GTG TAC GAG ACG GCC GAT GCC 1056 
Thr Asn Asp Cys Pro Asn Ser Ser Ile Val Tyr Glu Thr Ala Asp Ala 
340 345 350 

ATC CTG CAC ACT CCG GGG TGC GTC CCT TGC GTT CGT GAG GGC AAC GCC 1104 
Ile Leu His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Gly Asn Ala 
355 360 365 

TCG AGA TGT TGG GTG GCG GTG GCC CCC ACA GTG GCC ACC AGG GAT GGA 1152 
Ser Arg Cys Trp Val Ala Val Ala Pro Thr Val Ala Thr Arg Asp Gly 
370 375 380 

AAA CTC CCC GCA ACG CAG CTT CGA CGT CAC ATT GAT CTG CTT GTC GGG 1200 
Lys Leu Pro Ala Thr Gln Leu Arg Arg His Ile Asp Leu Leu Val Gly 
385 390 395 400 

AGC GCC ACC CTC TGT TCG GCC CTC TAC TTA AGG AGC TCG GTA CCC GGG 1248 
Ser Ala Thr Leu Cys Ser Ala Leu Tyr Leu Arg Ser Ser Val Pro Gly 
405 410 415 

GAT CCT CTA GAC TGC AGG CAT GCT AAG 1275 
Asp Pro Leu Asp Cys Arg His Ala Lys 
420 425 



(2) INFORMATION FOR SEQ ID NO:27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 425 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:27: 

Met Ser Phe Val Val Ile Ile Pro Ala Arg Tyr Ala Ser Thr Arg Leu 
1 5 10 15 

Pro Gly Lys Pro Leu Val Asp Ile Asn Gly Lys Pro Met Ile Val His 
20 25 30 

Val Leu Glu Arg Ala Arg Glu Ser Gly Ala Glu Arg Ile Ile Val Ala 
35 40 45 

Thr Asp His Glu Asp Val Ala Arg Ala Val Glu Ala Ala Gly Gly Glu 
50 55 60 

Val Cys Met Thr Arg Ala Asp His Gln Ser Gly Thr Glu Arg Leu Ala 
65 70 75 80 

Glu Val Val Glu Lys Cys Ala Phe Ser Asp Asp Thr Val Ile Val Asn 
85 90 95 

Val Gln Gly Asp Glu Pro Met Ile Pro Ala Thr Ile Ile Arg Gln Val 
100 105 110 

Ala Asp Asn Leu Ala Gln Arg Gln Val Gly Met Ala Thr Leu Ala Val 
115 120 125 

Pro Ile His Asn Ala Glu Glu Ala Phe Asn Pro Asn Ala Val Lys Val 
130 135 140 

Val Leu Asp Ala Glu Gly Tyr Ala Leu Tyr Phe Ser Arg Ala Thr Ile 
145 150 155 160 

Pro Trp Asp Arg Asp Arg Phe Ala Glu Gly Leu Glu Thr Val Gly Asp 
165 170 175 

Asn Phe Leu Arg His Leu Gly Ile Tyr Gly Tyr Arg Ala Gly Phe Ile 
180 185 190 

Arg Arg Tyr Val Asn Trp Gln Pro Ser Pro Leu Glu His Ile Glu Met 
195 200 205 

Leu Glu Gln Leu Arg Val Leu Trp Tyr Gly Glu Lys Ile His Val Ala 
210 215 220 

Val Ala Gln Glu Val Pro Gly Thr Gly Val Asp Thr Pro Glu Asp Leu 
225 230 235 240 

Asp Pro Ser Thr Arg Ile Arg Arg Ser Arg Asn Leu Gly Lys Val Ile 
245 250 255 

Asp Thr Leu Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu 
260 265 270 

Val Gly Ala Pro Leu Gly Gly Ala Ala Arg Ala Leu Gly His Gly Val 
275 280 285 

Arg Val Leu Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn Leu Pro Gly 
290 295 300 

Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Val 
305 310 315 320 

Pro Ala Ser Ala Tyr Gln Val Arg Asn Ser Ser Gly Leu Tyr His Val 
325 330 335 

Thr Asn Asp Cys Pro Asn Ser Ser Ile Val Tyr Glu Thr Ala Asp Ala 
340 345 350 

Ile Leu His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Gly Asn Ala 
355 360 365 

Ser Arg Cys Trp Val Ala Val Ala Pro Thr Val Ala Thr Arg Asp Gly 
370 375 380 

Lys Leu Pro Ala Thr Gln Leu Arg Arg His Ile Asp Leu Leu Val Gly 
385 390 395 400 

Ser Ala Thr Leu Cys Ser Ala Leu Tyr Leu Arg Ser Ser Val Pro Gly 
405 410 415 

Asp Pro Leu Asp Cys Arg His Ala Lys 
420 425 



(2) INFORMATION FOR SEQ ID NO:28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1401 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: circular 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 

(A) NAME/KEY: CDS 
(B) LOCATION: 1..1401 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:28: 

ATG AGT TTT GTG GTC ATT ATT CCC GCG CGC TAC GCG TCG ACG CGT CTG 48 
Met Ser Phe Val Val Ile Ile Pro Ala Arg Tyr Ala Ser Thr Arg Leu 
1 5 10 15 

CCC GGT AAA CCA TTG GTT GAT ATT AAC GGC AAA CCC ATG ATT GTT CAT 96 
Pro Gly Lys Pro Leu Val Asp Ile Asn Gly Lys Pro Met Ile Val His 
20 25 30 

GTT CTT GAA CGC GCG CGT GAA TCA GGT GCC GAG CGC ATC ATC GTG GCA 144 
Val Leu Glu Arg Ala Arg Glu Ser Gly Ala Glu Arg Ile Ile Val Ala 
35 40 45 

ACC GAT CAT GAG GAT GTT GCC CGC GCC GTT GAA GCC GCT GGC GGT GAA 192 
Thr Asp His Glu Asp Val Ala Arg Ala Val Glu Ala Ala Gly Gly Glu 
50 55 60 

GTA TGT ATG ACG CGC GCC GAT CAT CAG TCA GGA ACA GAA CGT CTG GCG 240 
Val Cys Met Thr Arg Ala Asp His Gln Ser Gly Thr Glu Arg Leu Ala 
65 70 75 80 

GAA GTT GTC GAA AAA TGC GCA TTC AGC GAC GAC ACG GTG ATC GTT AAT 288 
Glu Val Val Glu Lys Cys Ala Phe Ser Asp Asp Thr Val Ile Val Asn 
85 90 95 

GTG CAG GGT GAT GAA CCG ATG ATC CCT GCG ACA ATC ATT CGT CAG GTT 336 
Val Gln Gly Asp Glu Pro Met Ile Pro Ala Thr Ile Ile Arg Gln Val 
100 105 110 

GCT GAT AAC CTC GCT CAG CGT CAG GTG GGT ATG GCG ACT CTG GCG GTG 384 
Ala Asp Asn Leu Ala Gln Arg Gln Val Gly Met Ala Thr Leu Ala Val 
115 120 125 

CCA ATC CAC AAT GCG GAA GAA GCG TTT AAC CCG AAT GCG GTG AAA GTG 432 
Pro Ile His Asn Ala Glu Glu Ala Phe Asn Pro Asn Ala Val Lys Val 
130 135 140 

GTT CTC GAC GCT GAA GGG TAT GCA CTG TAC TTC TCT CGC GCC ACC ATT 480 
Val Leu Asp Ala Glu Gly Tyr Ala Leu Tyr Phe Ser Arg Ala Thr Ile 
145 150 155 160 

CCT TGG GAT CGT GAT CGT TTT GCA GAA GGC CTT GAA ACC GTT GGC GAT 528 
Pro Trp Asp Arg Asp Arg Phe Ala Glu Gly Leu Glu Thr Val Gly Asp 
165 170 175 

AAC TTC CTG CGT CAT CTT GGT ATT TAT GGC TAC CGT GCA GGC TTT ATC 576 
Asn Phe Leu Arg His Leu Gly Ile Tyr Gly Tyr Arg Ala Gly Phe Ile 
180 185 190 

CGT CGT TAC GTC AAC TGG CAG CCA AGT CCG TTA GAA CAC ATC GAA ATG 624 
Arg Arg Tyr Val Asn Trp Gln Pro Ser Pro Leu Glu His Ile Glu Met 
195 200 205 

TTA GAG CAG CTT CGT GTT CTG TGG TAC GGC GAA AAA ATC CAT GTT GCT 672 
Leu Glu Gln Leu Arg Val Leu Trp Tyr Gly Glu Lys Ile His Val Ala 
210 215 220 

GTT GCT CAG GAA GTT CCT GGC ACA GGT GTG GAT ACC CCT GAA GAT CTC 720 
Val Ala Gln Glu Val Pro Gly Thr Gly Val Asp Thr Pro Glu Asp Leu 
225 230 235 240 

GAC CCG TCG ACT CGA ATT CTG CTT GTC GGG AGC GCC ACC CTC TGC TCG 768 
Asp Pro Ser Thr Arg Ile Leu Leu Val Gly Ser Ala Thr Leu Cys Ser 
245 250 255 

GCC CTC TAT GTG GGG GAC TTG TGC GGG TCT GTC TTT CTT GTC GGT CAA 816 
Ala Leu Tyr Val Gly Asp Leu Cys Gly Ser Val Phe Leu Val Gly Gln 
260 265 270 

CTGTTCACTTTCTCCCXXAGGCAGCACTGGACAACGCAAG^CTGCAAC 864 
Leu Phe Thr Phe Ser Pro Arg Gln His Trp Thr Thr Gln Asp Cys Asn 
275 280 285 

TGT TCT ATC TAC CCC GGC CAC GTA ACG GGT CAC CGC ATG GCA TGG GAT 912 
Cys Ser Ile Tyr Pro Gly His Val Thr Gly His Arg Met Ala Trp Asp 
290 295 300 

ATG ATG ATG AAC TGG TCC CCT ACG ACA GCG CTG GTA GTA GCT CAG CTG 960 
Met Met Met Asn Trp Ser Pro Thr Thr Ala Leu Val Val Ala Gln Leu 
305 310 315 320 

CTC AGG GTC CCG CAA GCC ATC TTG GAC ATG ATC GCT GGT GCC CAC TGG 1008 
Leu Arg Val Pro Gln Ala Ile Leu Asp Met Ile Ala Gly Ala His Trp 
325 330 335 

GGA GTC CTA GCG GGC ATA GCG TAT TTC TCC ATG GTG GGG AAC TGG GCG 1056 
Gly Val Leu Ala Gly Ile Ala Tyr Phe Ser Met Val Gly Asn Trp Ala 
340 345 350 

AAG GTC CTG GTA GTG CTG CTG CTA TTT GCC GGC GTT GAC GCG GAA ACC 1104 
Lys Val Leu Val Val Leu Leu Leu Phe Ala Gly Val Asp Ala Glu Thr 
355 360 365 

CAC GTC ACC GGG GGA AGT GCC GGC CAC ATT ACG GCT GGG CTT GTT CGT 1152 
His Val Thr Gly Gly Ser Ala Gly His Ile Thr Ala Gly Leu Val Arg 
370 375 380 

CTC CTT TCA CCA GGC GCC AAG CAG AAC ATC CAA CTG ATC AAC ACC AAC 1200 
Leu Leu Ser Pro Gly Ala Lys Gln Asn Ile Gln Leu Ile Asn Thr Asn 
385 390 395 400 

GGC AGT TGG CAC ATC AAT AGC ACG GCC TTG AAC TGC AAT GAA AGC CTT 1248 
Gly Ser Trp His Ile Asn Ser Thr Ala Leu Asn Cys Asn Glu Ser Leu 
405 410 415 

AAC ACC GGC TGG TTA GCA GGG CTC TTC TAT CAC CAC AAA TTC AAC TCT 1296 
Asn Thr Gly Trp Leu Ala Gly Leu Phe Tyr His His Lys Phe Asn Ser 
420 425 430 

TCA GGC TGT CCT GAG AGG GTT GCC AGC TGC CGT CGC CTT ACC GAT TTT 1344 
Ser Gly Cys Pro Glu Arg Val Ala Ser Cys Arg Arg Leu Thr Asp Phe 
435 440 445 

GAC CAG GGC TGG GAA TTC GAG CTC GGT ACC CGG GGA TCC TCT AGA CTG 1392 
Asp Gln Gly Trp Glu Phe Glu Leu Gly Thr Arg Gly Ser Ser Arg Leu 
450 455 460 

CAG GCA TGC 1401 
Gln Ala Cys 
465 

(2) INFORMATION FOR SEQ ID NO:29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 467 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:29: 

Met Ser Phe Val Val Ile Ile Pro Ala Arg Tyr Ala Ser Thr Arg Leu 
1 5 10 15 

Pro Gly Lys Pro Leu Val Asp Ile Asn Gly Lys Pro Met Ile Val His 
20 25 30 

Val Leu Glu Arg Ala Arg Glu Ser Gly Ala Glu Arg Ile Ile Val Ala 
35 40 45 

Thr Asp His Glu Asp Val Ala Arg Ala Val Glu Ala Ala Gly Gly Glu 
50 55 60 

Val Cys Met Thr Arg Ala Asp His Gln Ser Gly Thr Glu Arg Leu Ala 
65 70 75 80 

Glu Val Val Glu Lys Cys Ala Phe Ser Asp Asp Thr Val Ile Val Asn 
85 90 95 

Val Gln Gly Asp Glu Pro Met Ile Pro Ala Thr Ile Ile Arg Gln Val 
100 105 110 

Ala Asp Asn Leu Ala Gln Arg Gln Val Gly Met Ala Thr Leu Ala Val 
115 120 125 

Pro Ile His Asn Ala Glu Glu Ala Phe Asn Pro Asn Ala Val Lys Val 
130 135 140 

Val Leu Asp Ala Glu Gly Tyr Ala Leu Tyr Phe Ser Arg Ala Thr Ile 
145 150 155 160 

Pro Trp Asp Arg Asp Arg Phe Ala Glu Gly Leu Glu Thr Val Gly Asp 
165 170 175 

Asn Phe Leu Arg His Leu Gly Ile Tyr Gly Tyr Arg Ala Gly Phe Ile 
180 185 190 

Arg Arg Tyr Val Asn Trp Gln Pro Ser Pro Leu Glu His Ile Glu Met 
195 200 205 

Leu Glu Gln Leu Arg Val Leu Trp Tyr Gly Glu Lys Ile His Val Ala 
210 215 220 

Val Ala Gln Glu Val Pro Gly Thr Gly Val Asp Thr Pro Glu Asp Leu 
225 230 235 240 

Asp Pro Ser Thr Arg Ile Leu Leu Val Gly Ser Ala Thr Leu Cys Ser 
245 250 255 

Ala Leu Tyr Val Gly Asp Leu Cys Gly Ser Val Phe Leu Val Gly Gln 
260 265 270 

Leu Phe Thr Phe Ser Pro Arg Gln His Trp Thr Thr Gln Asp Cys Asn 
275 280 285 

Cys Ser Ile Tyr Pro Gly His Val Thr Gly His Arg Met Ala Trp Asp 
290 295 300 

Met Met Met Asn Trp Ser Pro Thr Thr Ala Leu Val Val Ala Gln Leu 
305 310 315 320 

Leu Arg Val Pro Gln Ala Ile Leu Asp Met Ile Ala Gly Ala His Trp 
325 330 335 

Gly Val Leu Ala Gly Ile Ala Tyr Phe Ser Met Val Gly Asn Trp Ala 
340 345 350 

Lys Val Leu Val Val Leu Leu Leu Phe Ala Gly Val Asp Ala Glu Thr 
355 360 365 

His Val Thr Gly Gly Ser Ala Gly His Ile Thr Ala Gly Leu Val Arg 
370 375 380 

Leu Leu Ser Pro Gly Ala Lys Gln Asn Ile Gln Leu Ile Asn Thr Asn 
385 390 395 400 

Gly Ser Trp His Ile Asn Ser Thr Ala Leu Asn Cys Asn Glu Ser Leu 
405 410 415 

Asn Thr Gly Trp Leu Ala Gly Leu Phe Tyr His His Lys Phe Asn Ser 
420 425 430 

Ser Gly Cys Pro Glu Arg Val Ala Ser Cys Arg Arg Leu Thr Asp Phe 
435 440 445 

Asp Gln Gly Trp Glu Phe Glu Leu Gly Thr Arg Gly Ser Ser Arg Leu 
450 455 460 

Gln Ala Cys 
465 



(2) INFORMATION FOR SEQ ID NO:30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1422 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: circular 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..1422 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:30: 

ATG AGT TTT GTG GTC ATT ATT CCC GCG CGC TAC GCG TCG ACG CGT CTG 48 
Met Ser Phe Val Val Ile Ile Pro Ala Arg Tyr Ala Ser Thr Arg Leu 
1 5 10 15 

CCC GGT AAA CCA TTG GTT GAT ATT AAC GGC AAA CCC ATG ATT GTT CAT 96 
Pro Gly Lys Pro Leu Val Asp Ile Asn Gly Lys Pro Met Ile Val His 
20 25 30 

GTT CTT GAA CGC GCG CGT GAA TCA GGT GCC GAG CGC ATC ATC GTG GCA 144 
Val Leu Glu Arg Ala Arg Glu Ser Gly Ala Glu Arg Ile Ile Val Ala 
35 40 45 

ACC GAT CAT GAG GAT GTT GCC CGC GCC GTT GAA GCC GCT GGC GGT GAA 192 
Thr Asp His Glu Asp Val Ala Arg Ala Val Glu Ala Ala Gly Gly Glu 
50 55 60 

GTA TGT ATG ACG CGC GCC GAT CAT CAG TCA GGA ACA GAA CGT CTG GCG 240 
Val Cys Met Thr Arg Ala Asp His Gln Ser Gly Thr Glu Arg Leu Ala 
65 70 75 80 

GAA GTT GTC GAA AAA TGC GCA TTC AGC GAC GAC ACG GTG ATC GTT AAT 288 
Glu Val Val Glu Lys Cys Ala Phe Ser Asp Asp Thr Val Ile Val Asn 
85 90 95 

GTG CAG GGT GAT GAA CCG ATG ATC CCT GCG ACA ATC ATT CGT CAG GTT 336 
Val Gln Gly Asp Glu Pro Met Ile Pro Ala Thr Ile Ile Arg Gln Val 
100 105 110 

GCT GAT AAC CTC GCT CAG CGT CAG GTG GGT ATG GCG ACT CTG GCG GTG 384 
Ala Asp Asn Leu Ala Gln Arg Gln Val Gly Met Ala Thr Leu Ala Val 
115 120 125 

CCA ATC CAC AAT GCG GAA GAA GCG TTT AAC CCG AAT GCG GTG AAA GTG 432 
Pro Ile His Asn Ala Glu Glu Ala Phe Asn Pro Asn Ala Val Lys Val 
130 135 140 

GTT CTC GAC GCT GAA GGG TAT GCA CTG TAC TTC TCT CGC GCC ACC ATT 480 
Val Leu Asp Ala Glu Gly Tyr Ala Leu Tyr Phe Ser Arg Ala Thr Ile 
145 150 155 160 

CCT TGG GAT CGT GAT CGT TTT GCA GAA GGC CTT GAA ACC GTT GGC GAT 528 
Pro Trp Asp Arg Asp Arg Phe Ala Glu Gly Leu Glu Thr Val Gly Asp 
165 170 175 

AAC TTC CTG CGT CAT CTT GGT ATT TAT GGC TAC CGT GCA GGC TTT ATC 576 
Asn Phe Leu Arg His Leu Gly Ile Tyr Gly Tyr Arg Ala Gly Phe Ile 
180 185 190 

CGT CGT TAC GTC AAC TGG CAG CCA AGT CCG TTA GAA CAC ATC GAA ATG 624 
Arg Arg Tyr Val Asn Trp Gln Pro Ser Pro Leu Glu His Ile Glu Met 
195 200 205 

TTA GAG CAG CTT CGT GTT CTG TGG TAC GGC GAA AAA ATC CAT GTT GCT 672 
Leu Glu Gln Leu Arg Val Leu Trp Tyr Gly Glu Lys Ile His Val Ala 
210 215 220 

GTT GCT CAG GAA GTT CCT GGC ACA GGT GTG GAT ACC CCT GAA GAT CTC 720 
Val Ala Gln Glu Val Pro Gly Thr Gly Val Asp Thr Pro Glu Asp Leu 
225 230 235 240 

GAC CCG TCG ACC GAA TTC GGT GAC ATC ATC AAC GGC TTG CCC GTC TCC 768 
Asp Pro Ser Thr Glu Phe Gly Asp Ile Ile Asn Gly Leu Pro Val Ser 
245 250 255 

GCC CGT AGG GGC CAG GAG ATA CTG CTC GGA CCA GCC GAC GGA ATG GTC 816 
Ala Arg Arg Gly Gln Glu Ile Leu Leu Gly Pro Ala Asp Gly Met Val 
260 265 270 

TCC AAG GGG TGG AGG TTG CTG GCG CCC ATC ACG GCG TAC GCC CAG CAG 864 
Ser Lys Gly Trp Arg Leu Leu Ala Pro Ile Thr Ala Tyr Ala Gln Gln 
275 280 285 

ACA AGG GGC CTC CTA GGG TGT ATA ATC ACC AGC CTG ACT GGC CGG GAC 912 
Thr Arg Gly Leu Leu Gly Cys Ile Ile Thr Ser Leu Thr Gly Arg Asp 
290 295 300 

AAA AAC CAA GCG GAG GGT GAG GTC CAG ATT GTG TCA ACT GCT GCC CAA 960 
Lys Asn Gln Ala Glu Gly Glu Val Gln Ile Val Ser Thr Ala Ala Gln 
305 310 315 320 

ACT TTC CTG GCA ACG TGC ATC AAT GGG GTA TGC TGG ACT GTC TAC CAT 1008 
Thr Phe Leu Ala Thr Cys Ile Asn Gly Val Cys Trp Thr Val Tyr His 
325 330 335 

GGG GCC GGA ACG AGG ACC CTC GCA TCA CCC AAG GGT CCT GTT ATC CAG 1056 
Gly Ala Gly Thr Arg Thr Leu Ala Ser Pro Lys Gly Pro Val Ile Gln 
340 345 350 

ATG TAT ACC AAT GTA GAC CAA GAC CTT GTG GGC TGG CCC GCT CCT CAA 1104 
Met Tyr Thr Asn Val Asp Gln Asp Leu Val Gly Trp Pro Ala Pro Gln 
355 360 365 

GGT GCC CGC TCA TTG ACA CCC TGC ACC TGC GGC TCC TCG GAC CTT TAC 1152 
Gly Ala Arg Ser Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr 
370 375 380 

CTG GTT ACG AGG CAC GCC GAT GTC ATT CCC GTG CGC CGG CGG GGT GAT 1200 
Leu Val Thr Arg His Ala Asp Val Ile Pro Val Arg Arg Arg Gly Asp 
385 390 395 400 

AGC AGG GGC AGC CTG CTT TCG CCC CGG CCC ATT TCT TAT TTG AAA GGC 1248 
Ser Arg Gly Ser Leu Leu Ser Pro Arg Pro Ile Ser Tyr Leu Lys Gly 
405 410 415 

TCCTCX3GGGGGrrCCGCTGTTGTGCCCCGCX3GC^ 1296 
Ser Ser Gly Gly Pro Leu Leu Cys Pro Ala Gly His Ala Val Gly Ile 
420 425 430 

TTCAGGGCX^GCGGrrGTGTACCCGTGGAGTGGCTAAGGCG 1344 
Phe Arg Ala Ala Val Cys Thr Arg Gly Val Ala Lys Ala Val Asp Phe 
435 440 445 

GTC CCC GTG GAG AAC CTC GAG ACA ACC ATG AAT TCG AGC TCG GTA CCC 1392 
Val Pro Val Glu Asn Leu Glu Thr Thr Met Asn Ser Ser Ser Val Pro 
450 455 460 

GGG GAT CCT CTA GAC TGC AGG CAT GCT AAG 1422 
Gly Asp Pro Leu Asp Cys Arg His Ala Lys 
465 470 



(2) INFORMATION FOR SEQ ID NO:31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 474 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:31: 

Met Ser Phe Val Val Ile Ile Pro Ala Arg Tyr Ala Ser Thr Arg Leu 
1 5 10 15 

Pro Gly Lys Pro Leu Val Asp Ile Asn Gly Lys Pro Met Ile Val His 
20 25 30 

Val Leu Glu Arg Ala Arg Glu Ser Gly Ala Glu Arg Ile Ile Val Ala 
35 40 45 

Thr Asp His Glu Asp Val Ala Arg Ala Val Glu Ala Ala Gly Gly Glu 
50 55 60 

Val Cys Met Thr Arg Ala Asp His Gln Ser Gly Thr Glu Arg Leu Ala 
65 70 75 80 

Glu Val Val Glu Lys Cys Ala Phe Ser Asp Asp Thr Val Ile Val Asn 
85 90 95 

Val Gln Gly Asp Glu Pro Met Ile Pro Ala Thr Ile Ile Arg Gln Val 
100 105 110 

Ala Asp Asn Leu Ala Gln Arg Gln Val Gly Met Ala Thr Leu Ala Val 
115 120 125 

Pro Ile His Asn Ala Glu Glu Ala Phe Asn Pro Asn Ala Val Lys Val 
130 135 140 

Val Leu Asp Ala Glu Gly Tyr Ala Leu Tyr Phe Ser Arg Ala Thr Ile 
145 150 155 160 

Pro Trp Asp Arg Asp Arg Phe Ala Glu Gly Leu Glu Thr Val Gly Asp 
165 170 175 

Asn Phe Leu Arg His Leu Gly Ile Tyr Gly Tyr Arg Ala Gly Phe Ile 
180 185 190 

Arg Arg Tyr Val Asn Trp Gln Pro Ser Pro Leu Glu His Ile Glu Met 
195 200 205 

Leu Glu Gln Leu Arg Val Leu Trp Tyr Gly Glu Lys Ile His Val Ala 
210 215 220 

Val Ala Gln Glu Val Pro Gly Thr Gly Val Asp Thr Pro Glu Asp Leu 
225 230 235 240 

Asp Pro Ser Thr Glu Phe Gly Asp Ile Ile Asn Gly Leu Pro Val Ser 
245 250 255 

Ala Arg Arg Gly Gln Glu Ile Leu Leu Gly Pro Ala Asp Gly Met Val 
260 265 270 

Ser Lys Gly Trp Arg Leu Leu Ala Pro Ile Thr Ala Tyr Ala Gln Gln 
275 280 285 

Thr Arg Gly Leu Leu Gly Cys Ile Ile Thr Ser Leu Thr Gly Arg Asp 
290 295 300 

Lys Asn Gln Ala Glu Gly Glu Val Gln Ile Val Ser Thr Ala Ala Gln 
305 310 315 320 

Thr Phe Leu Ala Thr Cys Ile Asn Gly Val Cys Trp Thr Val Tyr His 
325 330 335 

Gly Ala Gly Thr Arg Thr Leu Ala Ser Pro Lys Gly Pro Val Ile Gln 
340 345 350 

Met Tyr Thr Asn Val Asp Gln Asp Leu Val Gly Trp Pro Ala Pro Gln 
355 360 365 

Gly Ala Arg Ser Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr 
370 375 380 

Leu Val Thr Arg His Ala Asp Val Ile Pro Val Arg Arg Arg Gly Asp 
385 390 395 400 

Ser Arg Gly Ser Leu Leu Ser Pro Arg Pro Ile Ser Tyr Leu Lys Gly 
405 410 415 

Ser Ser Gly Gly Pro Leu Leu Cys Pro Ala Gly His Ala Val Gly Ile 
420 425 430 

Phe Arg Ala Ala Val Cys Thr Arg Gly Val Ala Lys Ala Val Asp Phe 
435 440 445 

Val Pro Val Glu Asn Leu Glu Thr Thr Met Asn Ser Ser Ser Val Pro 
450 455 460 

Gly Asp Pro Leu Asp Cys Arg His Ala Lys 
465 470 



(2) INFORMATION FOR SEQ ID NO:32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1401 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 

(D) TOPOLOGY: circular 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 

(A) NAME/KEY: CDS 
(B) LOCATION: 1..1401 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:32: 

ATG AGT TTT GTG GTC ATT ATT CCC GCG CGC TAC GCG TCG ACG CGT CTG 48 
Met Ser Phe Val Val Ile Ile Pro Ala Arg Tyr Ala Ser Thr Arg Leu 
1 5 10 15 

CCC GGT AAA CCA TTG GTT GAT ATT AAC GGC AAA CCC ATG ATT GTT CAT 96 
Pro Gly Lys Pro Leu Val Asp Ile Asn Gly Lys Pro Met Ile Val His 
20 25 30 

GTT CTT GAA CGC GCG CGT GAA TCA GGT GCC GAG CGC ATC ATC GTG GCA 144 
Val Leu Glu Arg Ala Arg Glu Ser Gly Ala Glu Arg Ile Ile Val Ala 
35 40 45 

ACC GAT CAT GAG GAT GTT GCC CGC GCC GTT GAA GCC GCT GGC GGT GAA 192 
Thr Asp His Glu Asp Val Ala Arg Ala Val Glu Ala Ala Gly Gly Glu 
50 55 60 

GTA TGT ATG ACG CGC GCC GAT CAT CAG TCA GGA ACA GAA CGT CTG GCG 240 
Val Cys Met Thr Arg Ala Asp His Gln Ser Gly Thr Glu Arg Leu Ala 
65 70 75 80 

GAA GTT GTC GAA AAA TGC GCA TTC AGC GAC GAC ACG GTG ATC GTT AAT 288 
Glu Val Val Glu Lys Cys Ala Phe Ser Asp Asp Thr Val Ile Val Asn 
85 90 95 

GTG CAG GGT GAT GAA CCG ATG ATC CCT GCG ACA ATC ATT CGT CAG GTT 336 
Val Gln Gly Asp Glu Pro Met Ile Pro Ala Thr Ile Ile Arg Gln Val 
100 105 110 

GCT GAT AAC CTC GCT CAG CGT CAG GTG GGT ATG ACG ACT CTG GCG GTG 384 
Ala Asp Asn Leu Ala Gln Arg Gln Val Gly Met Thr Thr Leu Ala Val 
115 120 125 

CCA ATC CAC AAT GCG GAA GAA GCG TTT AAC CCG AAT GCG GTG AAA GTG 432 
Pro Ile His Asn Ala Glu Glu Ala Phe Asn Pro Asn Ala Val Lys Val 
130 135 140 

GTT CTC GAC GCT GAA GGG TAT GCA CTG TAC TTC TCT CGC GCC ACC ATT 480 
Val Leu Asp Ala Glu Gly Tyr Ala Leu Tyr Phe Ser Arg Ala Thr Ile 
145 150 155 160 

CCT TGG GAT CGT GAT CGT TTT GCA GAA GGC CTT GAA ACC GTT GGC GAT 528 
Pro Trp Asp Arg Asp Arg Phe Ala Glu Gly Leu Glu Thr Val Gly Asp 
165 170 175 

AAC TTC CTG CGT CAT CTT GGT ATT TAT GGC TAC CGT GCA GGC TTT ATC 576 
Asn Phe Leu Arg His Leu Gly Ile Tyr Gly Tyr Arg Ala Gly Phe Ile 
180 185 190 

CGT CGT TAC GTC AAC TGG CAG CCA AGT CCG TTA GAA CAC ATC GAA ATG 624 
Arg Arg Tyr Val Asn Trp Gln Pro Ser Pro Leu Glu His Ile Glu Met 
195 200 205 

TTA GAG CAG CTT CGT GTT CTG TGG TAC GGC GAA AAA ATC CAT GTT GCT 672 
Leu Glu Gln Leu Arg Val Leu Trp Tyr Gly Glu Lys Ile His Val Ala 
210 215 220 

GTT GCT CAG GAA GTT CCT GGC ACA GGT GTG GAT ACC CCT GAA GAT CTC 720 
Val Ala Gln Glu Val Pro Gly Thr Gly Val Asp Thr Pro Glu Asp Leu 
225 230 235 240 

GACCCGTCGACGAATTCCACCATGGGGCATTATCCTTGTACCATCAAC 768 
Asp Pro Ser Thr Asn Ser Thr Met Gly His Tyr Pro Cys Thr He Asn 
245 250 



TAC ACC CTG TTC AAA GTC AGG ATG TAC GTG GGA GGG GTC GAG CAC AGG 81 6 
Tyr Thr Leu Phe Lys Val Arg Met Tyr Val Gly Gly Val Glu His Arg 
260 265 270 

CTGGAAGTTGCTTGCAACTGGACGCGGGGCG^AGGTTGTGATCTGGAC 864 
Leu Glu Val Ala Cys Asn Trp Thr Arg Gly Glu Arg Cys Asp Leu Asp 
275 280 285 

GAC AGG GAC AGG TCC GAG CTC AGC CCG CTG CTG CTG TCC ACC ACT CAG 912
Asp Arg Asp Arg Ser Glu Leu Ser Pro Leu Leu Leu Ser Thr Thr Gln
290 295 300

TGG CAG GTC CTT CCG TGT TCC TTC ACG ACC TTG CCA GCC TTG ACC ACC 960
Trp Gln Val Leu Pro Cys Ser Phe Thr Thr Leu Pro Ala Leu Thr Thr
305 310 315 320

GGC CTC ATC CAC CTC CAC CAG AAC ATC GTG GAC GTG CAA TAC TTG TAC 1008
Gly Leu Ile His Leu His Gln Asn Ile Val Asp Val Gln Tyr Leu Tyr
325 330 335

GGG GTG GGG TCA AGC ATT GTG TCC TGG GCC ATC AAG TGG GAG TAC GTC 1056
Gly Val Gly Ser Ser Ile Val Ser Trp Ala Ile Lys Trp Glu Tyr Val
340 345 350

ATC CTC TTG TTT CTC CTG CTT GCA GAC GCG CGC ATC TGC TCC TGC TTG 1104
Ile Leu Leu Phe Leu Leu Leu Ala Asp Ala Arg Ile Cys Ser Cys Leu
355 360 365

TGG ATG ATG TTA CTC ATA TCC CAA GCG GAG GCA GCC TTG GAA AAC CTT 1152
Trp Met Met Leu Leu Ile Ser Gln Ala Glu Ala Ala Leu Glu Asn Leu
370 375 380

GTG TTA CTC AAT GCG GCG TCT CTG GCC GGG ACG CAC GGT CTT GTG TCC 1200
Val Leu Leu Asn Ala Ala Ser Leu Ala Gly Thr His Gly Leu Val Ser
385 390 395 400

TTC CTC GTG TTT TTC TGC TTT GCA TGG TAT CTG AAG GGT AAG TGG GTG 1248
Phe Leu Val Phe Phe Cys Phe Ala Trp Tyr Leu Lys Gly Lys Trp Val
405 410 415

CCC GGA GTG GCC TAC GCC TTC TAC GGG ATG TGG CCT TTC CTC CTG CTC 1296
Pro Gly Val Ala Tyr Ala Phe Tyr Gly Met Trp Pro Phe Leu Leu Leu
420 425 430

CTG TTA GCG TTG CCC CAA CGG GCA TAC GCG CTG GAC ACG GAG ATG GCC 1344
Leu Leu Ala Leu Pro Gln Arg Ala Tyr Ala Leu Asp Thr Glu Met Ala
435 440 445

GCG TCG TGT GGC GGC GTT GTT CTT GTC GGG TTA ATG GCG CTG ACT CTG 1392
Ala Ser Cys Gly Gly Val Val Leu Val Gly Leu Met Ala Leu Thr Leu
450 455 460

TCA CCA TAT 1401
Ser Pro Tyr
465
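
The agreement between each nucleotide line and the amino acid line printed beneath it can be spot-checked mechanically. The following is a minimal Python sketch, assuming the standard genetic code; the function name `translate` and both lookup tables are illustrative and are not part of the application. It translates the first 48 bases of SEQ ID NO:32 and prints the three-letter residues for comparison with the first line of the listing.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys", "Q": "Gln",
    "E": "Glu", "G": "Gly", "H": "His", "I": "Ile", "L": "Leu", "K": "Lys",
    "M": "Met", "F": "Phe", "P": "Pro", "S": "Ser", "T": "Thr", "W": "Trp",
    "Y": "Tyr", "V": "Val", "*": "***",  # "***" marks a stop codon
}


def translate(dna: str) -> list[str]:
    """Translate a DNA string (whitespace ignored) into three-letter residues."""
    seq = "".join(dna.split()).upper()
    if len(seq) % 3:
        raise ValueError("sequence length is not a multiple of three")
    return [THREE_LETTER[CODON_TABLE[seq[i:i + 3]]] for i in range(0, len(seq), 3)]


# First 48 bases of SEQ ID NO:32 as printed in the listing.
first_48 = "ATGAGTTTTGTGGTCATTATTCCCGCGCGCTACGCGTCGACGCGTCTG"
print(" ".join(translate(first_48)))
```

A "***" appearing in the middle of the output would flag a codon that reads as a stop, for example TAG printed where the amino acid line shows Tyr, which in a scanned listing usually points to a misread base rather than a real stop.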



(2) INFORMATION FOR SEQ ID NO:33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 467 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:33:

Met Ser Phe Val Val Ile Ile Pro Ala Arg Tyr Ala Ser Thr Arg Leu
1 5 10 15

Pro Gly Lys Pro Leu Val Asp Ile Asn Gly Lys Pro Met Ile Val His
20 25 30

Val Leu Glu Arg Ala Arg Glu Ser Gly Ala Glu Arg Ile Ile Val Ala
35 40 45

Thr Asp His Glu Asp Val Ala Arg Ala Val Glu Ala Ala Gly Gly Glu
50 55 60

Val Cys Met Thr Arg Ala Asp His Gln Ser Gly Thr Glu Arg Leu Ala
65 70 75 80

Glu Val Val Glu Lys Cys Ala Phe Ser Asp Asp Thr Val Ile Val Asn
85 90 95

Val Gln Gly Asp Glu Pro Met Ile Pro Ala Thr Ile Ile Arg Gln Val
100 105 110

Ala Asp Asn Leu Ala Gln Arg Gln Val Gly Met Thr Thr Leu Ala Val
115 120 125

Pro Ile His Asn Ala Glu Glu Ala Phe Asn Pro Asn Ala Val Lys Val
130 135 140

Val Leu Asp Ala Glu Gly Tyr Ala Leu Tyr Phe Ser Arg Ala Thr Ile
145 150 155 160

Pro Trp Asp Arg Asp Arg Phe Ala Glu Gly Leu Glu Thr Val Gly Asp
165 170 175

Asn Phe Leu Arg His Leu Gly Ile Tyr Gly Tyr Arg Ala Gly Phe Ile
180 185 190

Arg Arg Tyr Val Asn Trp Gln Pro Ser Pro Leu Glu His Ile Glu Met
195 200 205

Leu Glu Gln Leu Arg Val Leu Trp Tyr Gly Glu Lys Ile His Val Ala
210 215 220

Val Ala Gln Glu Val Pro Gly Thr Gly Val Asp Thr Pro Glu Asp Leu
225 230 235 240

Asp Pro Ser Thr Asn Ser Thr Met Gly His Tyr Pro Cys Thr Ile Asn
245 250 255

Tyr Thr Leu Phe Lys Val Arg Met Tyr Val Gly Gly Val Glu His Arg
260 265 270

Leu Glu Val Ala Cys Asn Trp Thr Arg Gly Glu Arg Cys Asp Leu Asp
275 280 285

Asp Arg Asp Arg Ser Glu Leu Ser Pro Leu Leu Leu Ser Thr Thr Gln
290 295 300

Trp Gln Val Leu Pro Cys Ser Phe Thr Thr Leu Pro Ala Leu Thr Thr
305 310 315 320

Gly Leu Ile His Leu His Gln Asn Ile Val Asp Val Gln Tyr Leu Tyr
325 330 335

Gly Val Gly Ser Ser Ile Val Ser Trp Ala Ile Lys Trp Glu Tyr Val
340 345 350

Ile Leu Leu Phe Leu Leu Leu Ala Asp Ala Arg Ile Cys Ser Cys Leu
355 360 365

Trp Met Met Leu Leu Ile Ser Gln Ala Glu Ala Ala Leu Glu Asn Leu
370 375 380

Val Leu Leu Asn Ala Ala Ser Leu Ala Gly Thr His Gly Leu Val Ser
385 390 395 400

Phe Leu Val Phe Phe Cys Phe Ala Trp Tyr Leu Lys Gly Lys Trp Val
405 410 415

Pro Gly Val Ala Tyr Ala Phe Tyr Gly Met Trp Pro Phe Leu Leu Leu
420 425 430

Leu Leu Ala Leu Pro Gln Arg Ala Tyr Ala Leu Asp Thr Glu Met Ala
435 440 445

Ala Ser Cys Gly Gly Val Val Leu Val Gly Leu Met Ala Leu Thr Leu
450 455 460

Ser Pro Tyr
465



(2) INFORMATION FOR SEQ ID NO:34: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1851 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: circular

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..1851

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:34:

ATG AGT TTT GTG GTC ATT ATT CCC GCG CGC TAC GCG TCG ACG CGT CTG 48
Met Ser Phe Val Val Ile Ile Pro Ala Arg Tyr Ala Ser Thr Arg Leu
1 5 10 15

CCC GGT AAA CCA TTG GTT GAT ATT AAC GGC AAA CCC ATG ATT GTT CAT 96
Pro Gly Lys Pro Leu Val Asp Ile Asn Gly Lys Pro Met Ile Val His
20 25 30

GTT CTT GAA CGC GCG CGT GAA TCA GGT GCC GAG CGC ATC ATC GTG GCA 144
Val Leu Glu Arg Ala Arg Glu Ser Gly Ala Glu Arg Ile Ile Val Ala
35 40 45

ACC GAT CAT GAG GAT GTT GCC CGC GCC GTT GAA GCC GCT GGC GGT GAA 192
Thr Asp His Glu Asp Val Ala Arg Ala Val Glu Ala Ala Gly Gly Glu
50 55 60

GTA TGT ATG ACG CGC GCC GAT CAT CAG TCA GGA ACA GAA CGT CTG GCG 240
Val Cys Met Thr Arg Ala Asp His Gln Ser Gly Thr Glu Arg Leu Ala
65 70 75 80

GAA GTT GTC GAA AAA TGC GCA TTC AGC GAC GAC ACG GTG ATC GTT AAT 288
Glu Val Val Glu Lys Cys Ala Phe Ser Asp Asp Thr Val Ile Val Asn
85 90 95

GTG CAG GGT GAT GAA CCG ATG ATC CCT GCG ACA ATC ATT CGT CAG GTT 336
Val Gln Gly Asp Glu Pro Met Ile Pro Ala Thr Ile Ile Arg Gln Val
100 105 110

GCT GAT AAC CTC GCT CAG CGT CAG GTG GGT ATG GCG ACT CTG GCG GTG 384
Ala Asp Asn Leu Ala Gln Arg Gln Val Gly Met Ala Thr Leu Ala Val
115 120 125

CCA ATC CAC AAT GCG GAA GAA GCG TTT AAC CCG AAT GCG GTG AAA GTG 432
Pro Ile His Asn Ala Glu Glu Ala Phe Asn Pro Asn Ala Val Lys Val
130 135 140

GTT CTC GAC GCT GAA GGG TAT GCA CTG TAC TTC TCT CGC GCC ACC ATT 480
Val Leu Asp Ala Glu Gly Tyr Ala Leu Tyr Phe Ser Arg Ala Thr Ile
145 150 155 160

CCTTGGGATCX3TGATCX3TTrTGCAGAAGGCCriTGAAACCGTTGGCGflJ 528 
Pro Trp Asp Arg Asp Arg Phe Ala Glu Gly Leu Glu Thr Val Gly Asp 
165 170 175 

AAC TTC CTG CGT CAT CTT GGT ATT TAT GGC TAC CGT GCA GGC TTT ATC 576
Asn Phe Leu Arg His Leu Gly Ile Tyr Gly Tyr Arg Ala Gly Phe Ile
180 185 190

CGT CGT TAC GTC AAC TGG CAG CCA AGT CCG TTA GAA CAC ATC GAA ATG 624
Arg Arg Tyr Val Asn Trp Gln Pro Ser Pro Leu Glu His Ile Glu Met
195 200 205

TTA GAG CAG CTT CGT GTT CTG TGG TAC GGC GAA AAA ATC CAT GTT GCT 672
Leu Glu Gln Leu Arg Val Leu Trp Tyr Gly Glu Lys Ile His Val Ala
210 215 220

GTT GCT CAG GAA GTT CCT GGC ACA GGT GTG GAT ACC CCT GAA GAT CTC 720
Val Ala Gln Glu Val Pro Gly Thr Gly Val Asp Thr Pro Glu Asp Leu
225 230 235 240

GAC CCG TCG ACT CGA ATT CGT AGG TCG CGC AAT TTG GGT AAG GTC ATC 768
Asp Pro Ser Thr Arg Ile Arg Arg Ser Arg Asn Leu Gly Lys Val Ile
245 250 255
245 250 255 
GAT ACC CTC ACG TGC GGC TTC GCC GAC CTC ATG GGG TAC ATT CCG CTC 816
Asp Thr Leu Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu
260 265 270

GTC GGC GCC CCT CTT GGA GGC GCT GCC AGG GCC CTG GCG CAT GGC GTC 864
Val Gly Ala Pro Leu Gly Gly Ala Ala Arg Ala Leu Ala His Gly Val
275 280 285

CGG GTT CTG GAA GAC GGC GTG AAC TAT GCA ACA GGG AAC CTT CCC GGT 912
Arg Val Leu Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn Leu Pro Gly
290 295 300

TGC TCT TTC TCT ATC TTC CTT CTG GCC CTG CTC TCT TGC CTG ACT GTG 960
Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Val
305 310 315 320

CCC GCG TCA TCC TAC CAA GTA CGC AAC TCC TCG GGC CTT TAT CAT GTC 1008
Pro Ala Ser Ser Tyr Gln Val Arg Asn Ser Ser Gly Leu Tyr His Val
325 330 335

ACC AAT GAT TGC CCC AAC TCG AGC ATT GTG TAC GAG ACG GCC GAT ACC 1056
Thr Asn Asp Cys Pro Asn Ser Ser Ile Val Tyr Glu Thr Ala Asp Thr
340 345 350

ATC CTA CAC TCT CCG GGG TGC GTC CCT TGC GTT CGC GAG GGC AAC ACC 1104
Ile Leu His Ser Pro Gly Cys Val Pro Cys Val Arg Glu Gly Asn Thr
355 360 365

TtXBAAATGTTGGGTGGCGGTGGCCCCCACAGTGGCCACCAGGGACGGC 1152 
Ser Lys Cys Trp Val Ala Val Ala Pro Thr Val Ala Thr Arg Asp Gly 
370 375 380 

AAA CTC CCC TCA ACG CAG CTT CGA CGT CAC ATC GAT CTG CTC GTC GGG 1200
Lys Leu Pro Ser Thr Gln Leu Arg Arg His Ile Asp Leu Leu Val Gly
385 390 395 400

AGC GCC ACC CTC TGC TCG GCC CTC TAT GTG GGG GAC TTG TGC GGG TCT 1248
Ser Ala Thr Leu Cys Ser Ala Leu Tyr Val Gly Asp Leu Cys Gly Ser
405 410 415

GTC TTT CTT GTC AGT CAA CTG TTC ACC TTC TCC CCT AGG CGC CAT TGG 1296
Val Phe Leu Val Ser Gln Leu Phe Thr Phe Ser Pro Arg Arg His Trp
420 425 430

ACA ACG CAA GAC TGC AAC TGT TCT ATC TAC CCC GGC CAT ATA ACG GGT 1344
Thr Thr Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly His Ile Thr Gly
435 440 445

CAC CGC ATG GCA TGG GAT ATG ATG ATG AAC TGG TCC CCT ACA ACG GCG 1392
His Arg Met Ala Trp Asp Met Met Met Asn Trp Ser Pro Thr Thr Ala
450 455 460

CTG GTA GTA GCT CAG CTG CTC AGG GTC CCA CAA GCC ATC TTG GAC ATG 1440
Leu Val Val Ala Gln Leu Leu Arg Val Pro Gln Ala Ile Leu Asp Met
465 470 475 480

ATC GCA GGT GCC CAC TGG GGA GTC CTA GCG GGC ATA GCG TAT TTC TCC 1488
Ile Ala Gly Ala His Trp Gly Val Leu Ala Gly Ile Ala Tyr Phe Ser
485 490 495

ATG GTG GGG AAC TGG GCG AAG GTC CTG GTA GTG CTG TTG CTG TTT TCC 1536
Met Val Gly Asn Trp Ala Lys Val Leu Val Val Leu Leu Leu Phe Ser
500 505 510

GGC GTC GAT GCG GCA ACC TAC ACC ACC GGG GGG AGC GTT GCT AGG ACC 1584
Gly Val Asp Ala Ala Thr Tyr Thr Thr Gly Gly Ser Val Ala Arg Thr
515 520 525

ACG CAT GGA TTC TCC AGC TTA TTC AGT CAA GGC GCC AAG CAG AAC ATC 1632
Thr His Gly Phe Ser Ser Leu Phe Ser Gln Gly Ala Lys Gln Asn Ile
530 535 540

CAG CTG ATT AAC ACC AAC GGC AGT TGG CAC ATC AAT CGC ACG GCC TTG 1680
Gln Leu Ile Asn Thr Asn Gly Ser Trp His Ile Asn Arg Thr Ala Leu
545 550 555 560

AAC TGT AAT GCG AGC CTC GAC ACT GGC TGG GTA GCG GGG CTC TTC TAT 1728
Asn Cys Asn Ala Ser Leu Asp Thr Gly Trp Val Ala Gly Leu Phe Tyr
565 570 575

TAC CAC AAA TTC AAC TCT TCA GGC TGC CCT GAG AGG ATG GCC AGC TGT 1776
Tyr His Lys Phe Asn Ser Ser Gly Cys Pro Glu Arg Met Ala Ser Cys
580 585 590

AGA CCC CTT GCC GAT TTT GAC CAG GGC TGG GAA TTC GAG CTC GGT ACC 1824
Arg Pro Leu Ala Asp Phe Asp Gln Gly Trp Glu Phe Glu Leu Gly Thr
595 600 605

CGG GGA TCC TCT AGA CTG CAG GCA TGC 1851
Arg Gly Ser Ser Arg Leu Gln Ala Cys
610 615



(2) INFORMATION FOR SEQ ID NO:35: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 617 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:35:

Met Ser Phe Val Val Ile Ile Pro Ala Arg Tyr Ala Ser Thr Arg Leu
1 5 10 15

Pro Gly Lys Pro Leu Val Asp Ile Asn Gly Lys Pro Met Ile Val His
20 25 30

Val Leu Glu Arg Ala Arg Glu Ser Gly Ala Glu Arg Ile Ile Val Ala
35 40 45

Thr Asp His Glu Asp Val Ala Arg Ala Val Glu Ala Ala Gly Gly Glu
50 55 60

Val Cys Met Thr Arg Ala Asp His Gln Ser Gly Thr Glu Arg Leu Ala
65 70 75 80

Glu Val Val Glu Lys Cys Ala Phe Ser Asp Asp Thr Val Ile Val Asn
85 90 95

Val Gln Gly Asp Glu Pro Met Ile Pro Ala Thr Ile Ile Arg Gln Val
100 105 110

Ala Asp Asn Leu Ala Gln Arg Gln Val Gly Met Ala Thr Leu Ala Val
115 120 125

Pro Ile His Asn Ala Glu Glu Ala Phe Asn Pro Asn Ala Val Lys Val
130 135 140

Val Leu Asp Ala Glu Gly Tyr Ala Leu Tyr Phe Ser Arg Ala Thr Ile
145 150 155 160

Pro Trp Asp Arg Asp Arg Phe Ala Glu Gly Leu Glu Thr Val Gly Asp
165 170 175

Asn Phe Leu Arg His Leu Gly Ile Tyr Gly Tyr Arg Ala Gly Phe Ile
180 185 190

Arg Arg Tyr Val Asn Trp Gln Pro Ser Pro Leu Glu His Ile Glu Met
195 200 205

Leu Glu Gln Leu Arg Val Leu Trp Tyr Gly Glu Lys Ile His Val Ala
210 215 220

Val Ala Gln Glu Val Pro Gly Thr Gly Val Asp Thr Pro Glu Asp Leu
225 230 235 240

Asp Pro Ser Thr Arg Ile Arg Arg Ser Arg Asn Leu Gly Lys Val Ile
245 250 255

Asp Thr Leu Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu
260 265 270

Val Gly Ala Pro Leu Gly Gly Ala Ala Arg Ala Leu Ala His Gly Val
275 280 285

Arg Val Leu Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn Leu Pro Gly
290 295 300

Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Val
305 310 315 320

Pro Ala Ser Ser Tyr Gln Val Arg Asn Ser Ser Gly Leu Tyr His Val
325 330 335

Thr Asn Asp Cys Pro Asn Ser Ser Ile Val Tyr Glu Thr Ala Asp Thr
340 345 350

Ile Leu His Ser Pro Gly Cys Val Pro Cys Val Arg Glu Gly Asn Thr
355 360 365

Ser Lys Cys Trp Val Ala Val Ala Pro Thr Val Ala Thr Arg Asp Gly
370 375 380

Lys Leu Pro Ser Thr Gln Leu Arg Arg His Ile Asp Leu Leu Val Gly
385 390 395 400

Ser Ala Thr Leu Cys Ser Ala Leu Tyr Val Gly Asp Leu Cys Gly Ser
405 410 415

Val Phe Leu Val Ser Gln Leu Phe Thr Phe Ser Pro Arg Arg His Trp
420 425 430

Thr Thr Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly His Ile Thr Gly
435 440 445

His Arg Met Ala Trp Asp Met Met Met Asn Trp Ser Pro Thr Thr Ala
450 455 460

Leu Val Val Ala Gln Leu Leu Arg Val Pro Gln Ala Ile Leu Asp Met
465 470 475 480

Ile Ala Gly Ala His Trp Gly Val Leu Ala Gly Ile Ala Tyr Phe Ser
485 490 495

Met Val Gly Asn Trp Ala Lys Val Leu Val Val Leu Leu Leu Phe Ser
500 505 510

Gly Val Asp Ala Ala Thr Tyr Thr Thr Gly Gly Ser Val Ala Arg Thr
515 520 525

Thr His Gly Phe Ser Ser Leu Phe Ser Gln Gly Ala Lys Gln Asn Ile
530 535 540

Gln Leu Ile Asn Thr Asn Gly Ser Trp His Ile Asn Arg Thr Ala Leu
545 550 555 560

Asn Cys Asn Ala Ser Leu Asp Thr Gly Trp Val Ala Gly Leu Phe Tyr
565 570 575

Tyr His Lys Phe Asn Ser Ser Gly Cys Pro Glu Arg Met Ala Ser Cys
580 585 590

Arg Pro Leu Ala Asp Phe Asp Gln Gly Trp Glu Phe Glu Leu Gly Thr
595 600 605

Arg Gly Ser Ser Arg Leu Gln Ala Cys
610 615
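
The stated lengths of the coding sequences and of the encoded proteins can be cross-checked with simple arithmetic: a CDS spanning bases 1..N with no stop codon inside the stated location encodes N/3 residues. The following Python sketch is illustrative only; the helper name `residues_encoded` is an assumption, and the numbers are the LOCATION and LENGTH fields printed in the listings for SEQ ID NO:32 to NO:35.

```python
def residues_encoded(cds_start: int, cds_end: int) -> int:
    """Number of codons in a CDS given its inclusive 1-based location."""
    n_bases = cds_end - cds_start + 1
    if n_bases % 3:
        raise ValueError("CDS length is not a multiple of three")
    return n_bases // 3


# (CDS location from the DNA listing, LENGTH from the protein listing)
pairs = {
    "SEQ ID NO:32 / NO:33": ((1, 1401), 467),
    "SEQ ID NO:34 / NO:35": ((1, 1851), 617),
}

for name, (location, stated_length) in pairs.items():
    computed = residues_encoded(*location)
    status = "consistent" if computed == stated_length else "MISMATCH"
    print(f"{name}: {computed} residues computed, {stated_length} stated ({status})")
```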
CLAIMS

1. A recombinant fusion protein SEQ. ID. NO. 1.

2. A recombinant fusion protein SEQ. ID. NO. 2.

3. A recombinant fusion protein SEQ. ID. NO. 3.

4. A recombinant fusion protein SEQ. ID. NO. 4.

5. A recombinant fusion protein SEQ. ID. NO. 5.

6. A polypeptide SEQ. ID. NO. 1.

7. A polypeptide SEQ. ID. NO. 2.

8. A polypeptide SEQ. ID. NO. 3.

9. A polypeptide SEQ. ID. NO. 4.

10. A polypeptide SEQ. ID. NO. 5.

11. An assay for identifying the presence of an antibody immunologically
reactive with an HCV antigen in a fluid sample comprising:

contacting the sample with at least one polypeptide selected from the group
consisting of recombinant fusion proteins SEQ. ID. NO. 1, SEQ. ID. NO. 2, SEQ. ID.
NO. 3, SEQ. ID. NO. 4, SEQ. ID. NO. 5, and polypeptides SEQ. ID. NO. 1, SEQ. ID. NO. 2,
SEQ. ID. NO. 3, SEQ. ID. NO. 4, SEQ. ID. NO. 5 under conditions suitable for
complexing the antibody with the polypeptide; and detecting the
antibody-polypeptide complex.

12. In a confirmatory assay for identifying the presence of an antibody in
a fluid sample immunologically reactive with an HCV antigen wherein the sample is
used to prepare first and second immunologically equivalent aliquots and the first
aliquot is contacted with at least one polypeptide selected from the group consisting
of recombinant fusion proteins SEQ. ID. NO. 1, SEQ. ID. NO. 2, SEQ. ID. NO. 3, SEQ.
ID. NO. 4, SEQ. ID. NO. 5, and polypeptides SEQ. ID. NO. 1, SEQ. ID. NO. 2, SEQ. ID.
NO. 3, SEQ. ID. NO. 4, SEQ. ID. NO. 5 under conditions suitable for complexing the
antibody with the polypeptide and wherein the first antibody-antigen complex is
detected, and:

contacting the second aliquot with a polypeptide selected from the group
consisting of recombinant fusion proteins SEQ. ID. NO. 1, SEQ. ID. NO. 2, SEQ. ID.
NO. 3, SEQ. ID. NO. 4, SEQ. ID. NO. 5, and polypeptides SEQ. ID. NO. 1, SEQ. ID. NO. 2,
SEQ. ID. NO. 3, SEQ. ID. NO. 4, SEQ. ID. NO. 5 under conditions suitable to form a
second antibody-antigen complex; and detecting the second antibody-antigen
complex; wherein the polypeptide selected in the first aliquot is not the same as the
polypeptide selected in the second aliquot.

13. In an immunodot assay for identifying the presence of an antibody
immunologically reactive with an HCV antigen in a fluid sample wherein the sample
is concurrently contacted with at least two polypeptides separately bound to distinct
regions of the solid support, each containing distinct epitopes of an HCV antigen
under conditions suitable for complexing the antibody with the polypeptide; and
detecting the antibody-polypeptide complex, and

wherein said polypeptides are selected from the group consisting of
recombinant fusion proteins SEQ. ID. NO. 1, SEQ. ID. NO. 2, SEQ. ID. NO. 3, SEQ. ID.
NO. 4, SEQ. ID. NO. 5, and polypeptides SEQ. ID. NO. 1, SEQ. ID. NO. 2, SEQ. ID. NO. 3,
SEQ. ID. NO. 4, SEQ. ID. NO. 5.

14. In a competition assay for identifying the presence of an antibody
immunologically reactive with an HCV antigen in a fluid sample wherein the sample
is used to prepare first and second immunologically equivalent aliquots wherein the
first aliquot is contacted with a polypeptide bound to a solid support under
conditions suitable for complexing the antibody with the polypeptide to form a
detectable antibody-polypeptide complex, and wherein the second aliquot is first
contacted with unbound polypeptide and then contacted with said bound polypeptide
wherein the polypeptide is selected from the group consisting of recombinant fusion
proteins SEQ. ID. NO. 1, SEQ. ID. NO. 2, SEQ. ID. NO. 3, SEQ. ID. NO. 4, SEQ. ID. NO.
5, and polypeptides SEQ. ID. NO. 1, SEQ. ID. NO. 2, SEQ. ID. NO. 3, SEQ. ID. NO. 4,
SEQ. ID. NO. 5.

15. In a competition assay for identifying the presence of an antibody
immunologically reactive with an HCV antigen in a fluid sample wherein the sample
is used to prepare first and second immunologically equivalent aliquots wherein the
first aliquot is contacted with a polypeptide bound to a solid support under
conditions suitable for complexing the antibody with the polypeptide to form a
detectable antibody-polypeptide complex and wherein the second aliquot is first
contacted with unbound polypeptide and then contacted with said bound polypeptide
wherein the polypeptide is selected from the group consisting of recombinant fusion
proteins SEQ. ID. NO. 1, SEQ. ID. NO. 2, SEQ. ID. NO. 3, SEQ. ID. NO. 4, SEQ. ID. NO.
5, and polypeptides SEQ. ID. NO. 1, SEQ. ID. NO. 2, SEQ. ID. NO. 3, SEQ. ID. NO. 4,
SEQ. ID. NO. 5; wherein the second aliquot is contacted with unbound and bound
polypeptide simultaneously.

16. In a neutralization assay for identifying the presence of an antibody
immunologically reactive with an HCV antigen in a fluid sample wherein the sample
is used to prepare first and second immunologically equivalent aliquots wherein the
first aliquot is contacted with a polypeptide bound to a solid support under
conditions suitable for complexing the antibody with the polypeptide to form a
detectable antibody-polypeptide complex wherein the bound polypeptide is selected
from the group consisting of recombinant fusion proteins SEQ. ID. NO. 1, SEQ. ID.
NO. 2, SEQ. ID. NO. 3, SEQ. ID. NO. 4, SEQ. ID. NO. 5, and polypeptides SEQ. ID. NO. 1,
SEQ. ID. NO. 2, SEQ. ID. NO. 3, SEQ. ID. NO. 4, SEQ. ID. NO. 5;

and wherein the second aliquot is first contacted with unbound polypeptide
and then contacted with said bound polypeptide wherein the unbound polypeptide is
selected from the group consisting of recombinant fusion proteins SEQ. ID. NO. 1,
SEQ. ID. NO. 2, SEQ. ID. NO. 3, SEQ. ID. NO. 4, SEQ. ID. NO. 5, and polypeptides SEQ.
ID. NO. 1, SEQ. ID. NO. 2, SEQ. ID. NO. 3, SEQ. ID. NO. 4, SEQ. ID. NO. 5 and wherein
the bound polypeptide selected is not the same as the unbound
polypeptide selected.

17. In a neutralization assay for identifying the presence of an antibody
immunologically reactive with an HCV antigen in a fluid sample wherein the sample
is used to prepare first and second immunologically equivalent aliquots wherein the
first aliquot is contacted with a polypeptide bound to a solid support under
conditions suitable for complexing the antibody with the polypeptide to form a
detectable antibody-polypeptide complex wherein the bound polypeptide is selected
from the group consisting of recombinant fusion proteins SEQ. ID. NO. 1, SEQ. ID.
NO. 2, SEQ. ID. NO. 3, SEQ. ID. NO. 4, SEQ. ID. NO. 5, and polypeptides SEQ. ID. NO. 1,
SEQ. ID. NO. 2, SEQ. ID. NO. 3, SEQ. ID. NO. 4, SEQ. ID. NO. 5;

and wherein the second aliquot is first contacted with unbound polypeptide
and then contacted with said bound polypeptide wherein the unbound polypeptide is
selected from the group consisting of recombinant fusion proteins SEQ. ID. NO. 1,
SEQ. ID. NO. 2, SEQ. ID. NO. 3, SEQ. ID. NO. 4, SEQ. ID. NO. 5, and polypeptides SEQ.
ID. NO. 1, SEQ. ID. NO. 2, SEQ. ID. NO. 3, SEQ. ID. NO. 4, SEQ. ID. NO. 5;

and wherein the bound polypeptide selected is not the same as the unbound
polypeptide selected;

and wherein the second aliquot is contacted with unbound and bound
polypeptide simultaneously.

18. An immunoassay kit comprising:

a polypeptide containing at least one HCV antigen selected from the group
consisting of recombinant fusion proteins SEQ. ID. NO. 1, SEQ. ID. NO. 2, SEQ. ID.
NO. 3, SEQ. ID. NO. 4, SEQ. ID. NO. 5, and polypeptides SEQ. ID. NO. 1, SEQ. ID. NO. 2,
SEQ. ID. NO. 3, SEQ. ID. NO. 4, SEQ. ID. NO. 5;

one or more sample preparation reagents;

and one or more detection and signal producing reagents.

19. A kit of claim 18 wherein the polypeptides are bound to a solid
support.
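
The two-aliquot structure of the confirmatory assay of claim 12 can be illustrated with a short Python sketch. It is not part of the claims and makes two assumptions that the claim text does not state: the names `AliquotResult` and `confirmatory_result` are hypothetical, and the decision rule (a sample is treated as confirmed only when a complex is detected in both aliquots) is one plausible reading. The only element taken directly from claim 12 is the requirement that the polypeptide used for the second aliquot differ from the one used for the first.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AliquotResult:
    polypeptide: str         # label only, e.g. "SEQ. ID. NO. 1"
    complex_detected: bool   # was an antibody-antigen complex detected?


def confirmatory_result(first: AliquotResult, second: AliquotResult) -> bool:
    """Assumed rule: confirmed when both aliquots show a detectable complex."""
    if first.polypeptide == second.polypeptide:
        # Claim 12: the two aliquots must be tested with different polypeptides.
        raise ValueError("the two aliquots must use different polypeptides")
    return first.complex_detected and second.complex_detected


a = AliquotResult("SEQ. ID. NO. 1", True)
b = AliquotResult("SEQ. ID. NO. 3", True)
print(confirmatory_result(a, b))
```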
            ASSAY WITH C100-3    ASSAY WITH pHCV-31, pHCV-34    CONFIRMATORY
SAMPLE      MANUAL S/CO          MANUAL S/CO                    RESULTS

1           >5.88 (+)            >5.65 (+)
2           0.63                 0.54
3           >5.88 (+)            >5.65 (+)
4           >5.88 (+)            >5.65 (+)                      +
5           0.43                 0.46
6           >5.88 (+)            >5.65 (+)                      +
7           0.46                 0.61
8           0.41                 0.70
9           1.87 (+)             1.83 (+)                       +
10          0.35                 4.88 (+)
11          0.48                 0.49
12          0.32                 0.50
13          0.48                 0.83
14          0.37                 0.37
15          >5.88 (+)            >5.65 (+)
16          >5.88 (+)            >5.65 (+)
17          0.34                 0.44
18          3.01 (+)             2.33 (+)                       +
19          0.74                 0.72
20          0.53                 0.76
21          >5.88 (+)            >5.65 (+)                      +
22          0.[?]4               0.30
23          >5.88 (+)            >5.65 (+)
24          0.69                 0.84
25          0.50                 0.75
26          3.41 (+)             2.38 (+)
27          0.62                 0.82
28          0.61                 0.53
29          0.34                 4.94 (+)
30          1.58 (+)             1.85 (+)
31          0.32                 0.52
32          >5.88 (+)            >5.65 (+)                      +
33          0.45                 0.58
34          >5.88 (+)            >5.65 (+)
35          >5.88 (+)            >5.65 (+)                      +
36          0.37                 0.44
37          [?]                  [?]
38          [?]                  >5.65 (+)
39*         0.40                 1.10 (+)
40          0.53                 0.63
41          0.41                 0.34
42          0.52                 0.70
43          0.28                 0.44
44          0.44                 0.70

S/CO = Sample OD / Cutoff OD
S/CO < 1.0 is non-reactive
S/CO ≥ 1.0 is reactive
[?] = not legible in the source

*This specimen was negative when retested in duplicate (0.56 and 0.51).

FIG. 14

WO 93/04088    PCT/US92/07188
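
The S/CO definition and the reactivity rule given in the footnotes to FIG. 14 can be written out as a short Python sketch. The function names `s_co` and `classify` are illustrative assumptions; the formula (sample OD divided by cutoff OD) and the threshold of 1.0 are taken from the footnotes, and the three S/CO values in the usage loop are those tabulated for samples 2, 9 and 39.

```python
def s_co(sample_od: float, cutoff_od: float) -> float:
    """Signal-to-cutoff ratio: sample optical density over cutoff optical density."""
    if cutoff_od <= 0:
        raise ValueError("cutoff OD must be positive")
    return sample_od / cutoff_od


def classify(ratio: float) -> str:
    """S/CO below 1.0 is non-reactive; 1.0 or above is reactive."""
    return "reactive" if ratio >= 1.0 else "non-reactive"


# S/CO values as tabulated in FIG. 14 (sample number, S/CO).
for sample, ratio in [(2, 0.63), (9, 1.87), (39, 1.10)]:
    print(sample, ratio, classify(ratio))
```

Sample 39 shows why a single result just above the threshold was retested: both duplicate retest values listed in the footnote fall below 1.0 and would be classified as non-reactive by the same rule.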
                      ALT      ASSAY WITH    ASSAY WITH           CONFIRMATORY
PATIENT  DATE         IU/L     C100-3        pHCV-31, pHCV-34     RESULTS

1        10/28/85     474      0.30 (-)      2.12 (+)
         11/11/85     113      0.38 (-)      4.72 (+)
         12/03/85     86       3.13          >5.65
         01/09/86     142      >5.61         NT                   NT
         03/19/86     90       >5.61 (+)     >5.65 (+)            +
         09/30/86     25       >5.61 (+)     >6.67 (+)            +

2        09/14/87     217      5.02 (+)      5.84 (+)
         09/17/87     210      >5.61         6.58 (+)

3        [?]/[?]/87   116      1.61          [?]

4        11/04/87     NA       [?]           [?]                  +
         12/17/87     NA       0.47 (-)      1.[?]7 (+)           +
         01/13/88     NA       0.46 (-)      1.56
         02/21/88     NA       0.34 (-)      1.45

7        10/02/85     298      0.79 (-)      2.94
         10/07/85     548      0.86 (-)      2.68 (+)
         10/23/85     334      2.06 (+)      2.32 (+)

10       01/25/89     NA       0.57 (-)      2.66 (+)
         02/01/89     NA       1.08 (+)      2.80
         02/08/89     NA       1.75 (+)      3.38 (+)
         02/23/89     NA       2.22 (+)      2.56 (+)
         03/01/89     NA       1.94 (+)      3.21 (+)
         03/08/89     NA       1.64 (+)      2.52 (+)
         03/22/89     NA       1.49 (+)      1.76 (+)             +
         04/12/89     NA       2.69 (+)      5.29
         04/26/89     NA       2.77 (+)      >5.65 (+)            +
         05/17/89     NA       2.19 (+)      2.82 (+)             +

13       10/05/88     NA       0.31 (-)      0.51 (-)             NT
         10/19/88     NA       0.40 (-)      0.61 (-)             NT
         10/28/88     NA       0.33 (-)      0.53 (-)             NT
         11/09/88     NA       0.33 (-)      0.54 (-)             NT
         11/11/88     NA       0.37 (-)      0.66 (-)             NT
         [?]          [?]      0.44 (-)      0.65 (-)
         12/05/88     NA       0.51 (-)      0.74 (-)
         [?]          [?]      0.28 (-)      0.68 (-)
         [?]          [?]      0.29 (-)      0.64 (-)             NT
         [?]          [?]      0.33 (-)      1.11 (+)
         [?]          [?]      [?]           1.11 (+)             +
         [?]          [?]      0.26 (-)      [?]
         [?]          [?]      [?]           1.88 (+)
         [?]          [?]      0.26 (-)      >5.65 (+)
         [?]          [?]      0.26 (-)      >5.65 (+)
         04/20/89     NA       0.29 (-)      >5.65 (+)            +
         04/28/89     NA       0.31 (-)      >5.65 (+)            +
         05/05/89     NA       0.28 (-)      >5.65 (+)            +
         07/03/89     NA       0.23 (-)      5.32 (+)

[?]      10/05/88     1454     0.53 (-)      0.95 (-)             NT
         10/20/88     612      0.57 (-)      2.04 (+)             +
         10/28/88     576      0.56 (-)      1.26 (+)
         11/09/88     306      0.54 (-)      1.39 (+)
         11/11/88     321      0.73 (-)      1.34 (+)
         11/18/88     341      0.83 (-)      1.43 (+)             +
         11/25/88     333      0.73 (-)      1.83 (+)
         12/05/88     232      0.75 (-)      1.92 (+)             +
         12/16/88     239      0.81 (-)      2.75 (+)
         12/23/88     198      1.20 (+)      3.42 (+)             +
         01/13/89     146      3.17 (+)      >5.65 (+)            +
         01/27/89     104      4.36 (+)      >6.67 (+)
         02/17/89     113      >5.61 (+)     >6.67 (+)
         02/24/89     120      >5.61 (+)     >6.67 (+)

18       01/13/89     112      >5.61 (+)     >5.65 (+)
         01/21/89     72       >5.61 (+)     >5.65 (+)
         01/28/89     181      >5.61 (+)     >5.67 (+)            +
         02/08/89     106      >5.61 (+)     >5.65 (+)

[?]      [?]          [?]      [?]           [?] (+)
         [?]          [?]      0.35 (-)      [?] (+)
         [?]          [?]      0.32 (-)      1.54 (+)
         04/28/89     NA       0.29 (-)      1.04 (+)             +
         05/05/89     NA       0.36 (-)      1.16 (+)             +
         07/03/89     NA       0.30 (-)      1.24 (+)             +

NT = Not Tested
NA = Not Available
[?] = not legible or missing in the source

FIG. 16B






REFLECTANCE DENSITY VALUES

ANTIGEN      NEGATIVE MEAN    CUTOFF
C100-3       0.023            0.129
pHCV-23
pHCV-29
pHCV-34